Understanding auto-vacuum and when it is triggered

Question

We've noticed one of our tables growing considerably on PG 12. This table is the target of very frequent updates, with a mix of column types, including a very large text column (often with over 50kb of data) - we run a local cron job that looks for rows older than X time and set the text column to a null value (as we no longer need the data for that particular column after X amount of time).

We understand this does not actually free up disk space due to the MVCC model, but we were hoping that auto-vacuum would take care of this. To our surprise, the table continues to grow (now over 40gb worth) without auto-vacuum running. Running a vacuum manually has addressed the issue and we no longer see growth.

This has lead me to investigate other tables, I'm realising that I don't understand how auto-vacuum is triggered at all.

Here is my understanding of how it works, which hopefully someone can pick apart:

I look for tables that have a large amount of dead tuples in them: select * from pg_stat_all_tables ORDER BY n_dead_tup desc;
I identify tableX with 33169557 dead tuples (n_dead_tup column).
I run a select * from pg_class ORDER BY reltuples desc; to check how many estimated rows there are on table tableX
I identify 1725253 rows via the reltuples column.
I confirm my autovacuum settings: autovacuum_vacuum_threshold = 50 and autovacuum_vacuum_scale_factor = 0.2
I apply the formula threshold + pg_class.reltuples * scale_factor, so, 50 + 1725253 * 0.2 which returns 345100.6

It is my understanding that auto-vacuum will start on this table once ~345100 dead tuples are found. But tableX is already at a whopping 33169557 dead tuples!, The last_autovacuum on this table was back in February.

Any clarification would be welcome.

How many workers are active? https://www.postgresql.org/docs/12/maintenance.html — Frank Heikens, May 10 '21 at 12:18

Laurenz Albe · Accepted Answer · 2021-05-10T14:46:18.663

2

Your algorithm is absolutely correct.

Here are some reasons why things could go wrong:

autovacuum runs, but is so slow that it never gets done

If you see no running autovacuum, that is not your problem.
autovacuum runs, but a long running open transaction prevents it from removing dead tuples
other tables need to be vacuumed more urgently (to avoid transaction ID wraparound), so the three workers are busy with other things
autovacuum runs, but conflicts with high concurrent locks on the table (LOCK TABLE, ALTER TABLE, ...)

This makes autovacuum give up and try again later.
autovacuum is disabled, perhaps only for that table

edited May 10 '21 at 14:46

answered May 10 '21 at 13:26

Laurenz Albe

209,280
17
206
263

Is there anything I could look for in the logs (or any sort of log level requirement) for us to see why autovacuums are failing? – joe May 10 '21 at 14:37
1

You can set `log_autovacuum_min_duration` to 0. – Laurenz Albe May 10 '21 at 14:45
This was a while ago, but we have another application within our systems that has the same exact behaviour as described here (the feature implementation is shared between projects), same table structure, same exact behaviour - The only difference is that it runs on PG 13, whereas the version of this question is 12. I can see autovacuum kicking very frequently on the PG13 table version, whereas the PG12 table hadn't been autovacuumed in in months... Do you know if something was addressed between these major versions? – joe May 28 '21 at 21:19
1

Yes, I added the parameters `autovacuum_vacuum_insert_threshold` and `autovacuum_vacuum_scale_factor`. – Laurenz Albe May 29 '21 at 07:09

Understanding auto-vacuum and when it is triggered

1 Answers1

Related