0

I know that MongoDB's WiredTiger storage engine uses a clustered index to store data. Does WiredTiger cluster on the _id field, or on another key generated by WiredTiger?

Grug

2 Answers

1

WiredTiger uses a B-tree-like structure to store documents. It is essentially a key-value store, where the key is an internally generated identifier and the value is the document.

All indexes, including the one on the _id field, map field values to the internal identifier.
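The two-step lookup described above can be sketched roughly as follows. This is a toy model, not WiredTiger code: the class `ToyCollection` and its attribute names are hypothetical, and plain dicts stand in for the on-disk trees.

```python
from itertools import count


class ToyCollection:
    """Toy model: a record store keyed by an internal id, plus an _id index."""

    def __init__(self):
        self._next_record_id = count(1)  # internally generated identifier (assumed sequential here)
        self.records = {}                # internal id -> document
        self.id_index = {}               # _id value -> internal id

    def insert(self, doc):
        record_id = next(self._next_record_id)
        self.records[record_id] = doc              # the document lives under the internal id
        self.id_index[doc["_id"]] = record_id      # the index stores only the internal id
        return record_id

    def find_by_id(self, _id):
        # Step 1: the index maps the field value to the internal id.
        record_id = self.id_index.get(_id)
        if record_id is None:
            return None
        # Step 2: the record store maps the internal id to the document.
        return self.records[record_id]
```

For example, after `coll.insert({"_id": "abc", "x": 1})`, calling `coll.find_by_id("abc")` goes through `id_index` first and only then reads `records`, which is why the _id index in this model is not where the document itself is stored.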

Joe
  • Is the index on _id a B-tree or a B+ tree? – Grug Mar 09 '21 at 09:24
  • Yes, something like that. – Joe Mar 09 '21 at 10:47
  • What I actually want to know is which type of tree the index on _id uses: B-tree or B+ tree? – Grug Mar 09 '21 at 10:53
  • It is similar to a B+ tree. – Joe Mar 09 '21 at 11:29
  • You are right! I found RecordId (the internal identifier) in the MongoDB WiredTiger source code, and the value stored in the _id index is a RecordId, so the index on the _id field is not a clustered index. – Grug Mar 10 '21 at 10:38
  • I am continuing to read the source code to find out whether the index structure is a B-tree or a B+ tree, but I still haven't found an answer there. Can you give me some pointers into the source code? – Grug Mar 10 '21 at 10:40
0

Yes, they are clustered and stored in WiredTiger files. For each index defined on a collection, WiredTiger creates and manages a separate index file.

"Prior to MongoDB 3.2, only B-tree was available in the storage layer. To increase its scalability, MongoDB added LSM Tree in later versions after it acquired WiredTiger" [1]

An LSM tree can provide better performance when we have a workload of random inserts that would otherwise overflow our page cache and start paging in data from disk to keep our index up to date.
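To illustrate why random inserts are cheap in an LSM tree, here is a minimal sketch of the general idea, assuming an in-memory buffer that is flushed as a sorted run once it fills up. The class `ToyLSM` and the parameter `memtable_limit` are hypothetical names for illustration only; this is not how WiredTiger implements LSM, and it omits merging of runs.

```python
import bisect


class ToyLSM:
    """Toy LSM tree: writes go to memory, then are flushed as sorted runs."""

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit
        self.memtable = {}  # in-memory buffer absorbing random inserts
        self.runs = []      # immutable sorted runs of (key, value), newest last

    def put(self, key, value):
        # A write only touches the in-memory buffer, wherever the key would sort.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Write the whole buffer out at once as a single sorted run.
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Reads check the buffer first, then the runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            keys = [k for k, _ in run]
            i = bisect.bisect_left(keys, key)
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

The trade-off visible even in this sketch is that `put` never has to read an existing page to update it, while `get` may have to look through several runs.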

To override the default WiredTiger storage type configuration for indexes:

mongod --wiredTigerIndexConfigString "type=lsm,block_compressor=zlib"

Ali Can