Do RDBs like MySQL and PostgreSQL store copy of data for each index? Or just B-tree with links to real objects?

Question

How RDBs like MySQL and PostgreSQL manage memory for new indexes?

My guess that RDB creates B-tree (or other indexes) with References/Links to real objects in memory. Another guess that it duplicates all the data for each new index.

So basically this question is about "What B-tree consists of? References, or real objects?"

Google search is too overheat about DB topics and RDMS products. So, I also would be very grateful for good articles about this.

Each RDBMS has it own internal structure for index. So while indexes contain *links* to the actual data the details vary. — Belayer, May 19 '21 at 16:55

score 1 · Accepted Answer · answered May 19 '21 at 17:07

The details vary, but a B-tree index is a tree structure that is stored on disk. It contains duplicates of the indexed terms (the index keys) and a (direct or indirect) pointer to the indexed row in the table.

A B-tree index represents a sorted list of the index keys that allows fast searches. The tree structure speeds up searching through the list and allows inserting and deleting entries without too much data churn.

It is unclear what you mean by a "real object". The index keys are certainly real, and they are stored in the index. But if you mean the whole table row, that is only referenced from the index.

Thank you, that is exactly what I was looking for – Игорь Кравченко May 19 '21 at 17:36 — Игорь Кравченко, May 19 '21 at 17:36

score 1 · Answer 2 · answered May 21 '21 at 00:55

For MySQL's Engine=InnoDB, it works this way:

The data is stored in PRIMARY KEY order in a B+Tree. This makes lookups and ranges based on the PK very efficient.

Each secondary keys is also a B+Tree, but ordered by the order given by the secondary key column(s). Each "row" also has the columns of the PK, thereby providing the reference (link) to the data's BTree.

If the columns of the secondary key plus the PK are the only columns you need, then the query is performed using only the secondary key's BTree.

There is no "ROWNUM" as found in some other database brands.

If you don't hit certain limits, you could include all the table's columns in a secondary index.

Do RDBs like MySQL and PostgreSQL store copy of data for each index? Or just B-tree with links to real objects?

2 Answers2