3

I know that the physical storage in PostgreSQL looks like:

heap table:
<old_tuple, t_xmin, t_xmax>
<new_tuple, t_xmin, t_xmax>
index:
<old_index_value, old_RID>
<new_index_value, new_RID>

So Index-Only Scan need the help of Visibility Map.

My question is: Why not we store the t_xmin, t_xmax in index as well?

like:

index:
<old_index_value, old_RID, t_xmin, t_xmax>
<new_index_value, new_RID, t_xmin, t_xmax>
Yriuns
  • 755
  • 1
  • 8
  • 22
  • You are asking about a very specific implementation detail. I'm not sure how many people understand such details of postgres, certainly I don't. Have you tried contacting postgres devs directly? – freakish Feb 13 '21 at 09:42
  • Try pgsql-hackers, that's where the core developers live: https://www.postgresql.org/list/pgsql-hackers/ – Frank Heikens Feb 13 '21 at 10:46

1 Answers1

4

There is relatively high overhead - t_xmin, t_xmax has 8 bytes together, and probably will have 16 bytes in future. So if Postgres stores these values to index, then almost all numeric indexes will be 2 (bigint) times or 2/3 times (int) bigger.

Today it's not an problem (probably), but Postgres beginnings are in half of 80 years, and there disc capacity was big problem.

Second motivation is, probably, complexity of code and ensuring data consistency (without hard locking). Indexes in Postgres was data access accelerators, not source of data. Then the implementation is more simpler. Ingres was designed by very smart professors and students and more robust less complex (but possible slower) design was preferred.

Pavel Stehule
  • 42,331
  • 5
  • 91
  • 94
  • Also, that would be redundant information, so another possible source of inconsistency. – Laurenz Albe Feb 13 '21 at 16:48
  • 1
    @LaurenzAlbe - sure, but this is true about any access optimization - indexes are redundant too (and anybody knows how bad are inconsistent index), and visibility map too (little bit). I think so this design is very old - maybe from Ingres - so the motivation and design principles can be found in Ingres papers and specifications wrote by Stonebraker and his team. – Pavel Stehule Feb 13 '21 at 18:21
  • Also if we store `t_xmax` in the index, we will have to update it. Imagine we have multiple indices containing some tuple, then when it is updated/deleted, we have to go over all those indices and update `t_xmax` there. Seems like too much overhead (NB: thats just my thoughts, please correct me if Im wrong) – Flame239 Jan 06 '23 at 12:41