
We have a very large table that was partitioned into monthly tables. We have no autovacuum parameters set in the postgresql.conf file, so it's on by default with default parameters.

The tables for past months (table_201404, table_201403) are never written to, updated, or deleted from once the month has passed; they are only read for historical data. Why are we noticing autovacuum processes running on these tables? Is it because they are part of a main partition and PostgreSQL sees those tables as one?

We are toying with the idea of setting autovacuum_enabled to off for these past tables, but I wanted to consult the wisdom of Stack Overflow first.
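(For reference, autovacuum_enabled is a per-table storage parameter rather than a postgresql.conf setting, so it would have to be set on each historical partition individually. Using the example partition names from the question:)

```sql
-- Disable regular autovacuum on the old partitions
-- (table names are the example partitions above; substitute your own).
ALTER TABLE table_201403 SET (autovacuum_enabled = false);
ALTER TABLE table_201404 SET (autovacuum_enabled = false);
```

Note that this does not stop everything: PostgreSQL will still launch anti-wraparound autovacuums on these tables when their transaction ID age gets too high, even with autovacuum_enabled = false.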

Thanks all...

Vao Tsun
dsmorey
  • Is it a problem that the vacuum daemon checks those tables? Does it impact your performance? Can you measure that impact? –  May 08 '14 at 13:29
  • Not sure how to measure the impact, but these tables, even being partitioned, are very large, and we think performance is being hindered by these autovacuum processes. – dsmorey May 08 '14 at 13:31
  • "*we think performance is being hindered*" - so you have no proof nor any sign of that? –  May 08 '14 at 13:33
  • Sorry, wasn't sure if you meant I measured somehow specifically. What I do is run a query for the current processes. Sometimes I'll notice there are two or three autovacuums running, and then I'll notice our disk I/O is almost double what it should be for that particular time of the day. – dsmorey May 08 '14 at 23:27
  • I'll give you a concrete example. I have a series of queries that runs every night at this time. It usually takes about 25 minutes without autovacuum. Right now it's almost an hour, and it's taking that long because two tables are being autovacuumed. These tables hold very old data and will never be touched; I'm not sure why autovacuum would need to run on them. – dsmorey May 09 '14 at 01:25
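(For anyone wanting to measure this the way dsmorey describes: running autovacuum workers are visible in pg_stat_activity, for example with a query along these lines. On PostgreSQL 9.1 and older the columns are named procpid and current_query rather than pid and query.)

```sql
-- List currently running autovacuum workers and what they are vacuuming.
SELECT pid, query_start, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';
```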

1 Answer


Even read-only tables need to be vacuumed to prevent transaction ID wrap-around, which would otherwise occur after 2 billion transactions; under the default settings they are vacuumed for wrap-around roughly once every 150 million transactions.

The transaction IDs stored with each row are 32 bits wide, so they eventually wrap around. To prevent this from causing problems, any very old transaction ID has to be replaced with a magic value meaning "older than all other IDs", and the table has to be scanned to do that replacement. If the table never changes, eventually every transaction ID in it will have been replaced with the magic value, and conceptually that table no longer needs to be scanned. But that fact is not stored anywhere, so the table still needs to be scanned every now and then so that the system can observe that all the IDs are still OK. Fortunately the scan is done sequentially and only needs to read, not write, so it should be fairly efficient.
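(You can watch how close each table is to triggering such a scan. When age(relfrozenxid) approaches autovacuum_freeze_max_age, which defaults to 200 million, autovacuum will process the table even if it is never modified:)

```sql
-- Age, in transactions, of each table's oldest unfrozen transaction ID;
-- the tables nearest the wrap-around threshold sort first.
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC
LIMIT 10;
```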

It is possible that this whole mechanism will be redone in 9.5 so that tables like that would no longer need to be scanned.

jjanes