In MySQL InnoDB or lots of other database engines, the primary key is implemented with clustered index. However after searching with secondary index, the engine must look up into clustered index with primary keys provided in secondary index(if there is no covering index).
InnoDB uses B+ tree for its clustered index, it is a structure with O(log n)
complexity in searching, so we can summerize the procedure like the following:
- Using clusterd index:
One pass, Cost
O(n)
. - Using secondary index:
Two passes. The first pass cost
O(log n)
an result inm
records. Then the second pass costO(log n)
for each of them
records, so the time complexity will bem*O(log n)
.
I know when using hasing, the time complexity in seaching can be reduced to O(1)
, so I am wondering why these database engines prefer using B+ tree rather than hasing techniques(e.g. build a KV store)? Is it because of records are stored on disk rather than in memory?
Meanwhile, I have another question, some other databases, like RocksDB, use KV storage rather than B+ tree. Why they use that?
EDIT
I want to make the question more clearly. I find many tables are designed with auto increment
PK, rather than using something with actual meaning, like phone number or IP. So B+ tree's advantage is not fully exploited. For example, B+ tree is good at searching data in range, but I searching a auto increment
PK in range is rare in practice.