Why is a Hash Table not used instead of a B-Tree in order to access data inside a database?

Technology Community › Category: Trees › Why is a Hash Table not used instead of a B-Tree in order to access data inside a database?

0 Vote Up Vote Down

VietMX Staff asked 4 years ago

Problem

In MySQL, an index type is a b-tree, and access an element in a b-tree is in logarithmic amortized time O(log n). On the other hand, accessing an element in a hash table is in O(1). Why is a hash table not used instead of a b-tree in order to access data inside a database?

You can only access elements by their primary key in a hashtable. This is faster than with a tree algorithm (O(1) instead of log(n)), but you cannot select ranges (everything in between x and y). Tree algorithms support this in Log(n) whereas hash indexes can result in a full table scan O(n).

Also the constant overhead of hash indexes is usually bigger (which is no factor in theta notation, but it still exists). Also tree algorithms are usually easier to maintain, grow with data, scale, etc.

Hash indexes work with pre-defined hash sizes, so you end up with some “buckets” where the objects are stored in. These objects are looped over again to really find the right one inside this partition.

The size of a database table is not known in advance so the table must be rehashed now and then to get optimal performance out of an hashtable. The rehashing is also expensive.

Question Tags: Senior Level, Trees