0

I am trying to learn PostgreSQL MVCC architecture. It says that MVCC creates a separate snapshot for each concurrent query. Isn't this approach memory inefficient?

For example if there are 1000 concurrent queries and table size is huge. This will create multiple instances of the table.

Is my understanding correct?

code0079
  • 147
  • 3
  • 13

1 Answers1

2

It says that MVCC creates a separate snapshot for each concurrent query. Isn't this approach memory inefficient?

You could argue it is memory inefficient. It usually isn't a big problem in practise.

For example if there are 1000 concurrent queries and table size is huge.

Why would you have/want 1000 concurrent queries? Do you have 1000 CPUs? If there is a risk that you will try to establish 1000 concurrent queries, then you should deploy some entry control mechanism (like a connection pooler) that prevents this from happening, with a fallback to max_connections.

This will create multiple instances of the table.

A snapshot is not a copy of the table. Is just a set of information that gets applied to the base table rows dynamically to decide which rows are visible in that snapshot. The size of a snapshot is proportional to number of concurrent transactions (one reason not have 1000 of them), not to the size of the table.

jjanes
  • 37,812
  • 5
  • 27
  • 34
  • this is a great explanation. Just one doubt , Is the max_connection attribute of database server? – code0079 Feb 23 '20 at 16:25
  • and what is used in postgresql, locking or mvcc? – code0079 Feb 23 '20 at 16:30
  • Yes, max_connections is a database parameter set in postgresql.conf. PostgreSQL uses both locking and MVCC. Row locks block concurrent writers, but do not block pure readers (i.e. ones without FOR UPDATE). I don't think you can have an MVCC with absolutely no locks. – jjanes Feb 23 '20 at 16:50
  • Only the changes or new rows are buffered as new versions so it isn't really a snapshot of the whole table, it just looks like one because when you do a query, you can only "see" data that has a timestamp that is less than or equal to your transactions timestamp. – Samuel Squire Jul 03 '21 at 08:00