I have a database table with around 5-6 million rows, and vacuuming it takes around 20 minutes. Since one column of this table is updated very frequently, there are a lot of dead rows to deal with.
As a rough estimate, with our current user base the table can accumulate about 2 million dead tuples per day (the query after the list below is roughly how I check this). So vacuuming this table requires both:
- Read IO, since the whole table does not fit in shared buffers.
- Write IO, since there are a lot of dead entries to clean up.
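For reference, this is roughly how I monitor the dead-tuple count; `my_table` is just a placeholder for the actual table name:

```sql
-- Dead tuples and last (auto)vacuum times for the table in question.
-- "my_table" is a placeholder name.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum,
       autovacuum_count
FROM   pg_stat_user_tables
WHERE  relname = 'my_table';
```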
What would be an ideal way to vacuum this table? Should I increase `autovacuum_vacuum_cost_limit` to allow more work per autovacuum run? As far as I can see, that will increase IOPS, which might again hurt performance. Currently I have `autovacuum_vacuum_scale_factor = 0.2`. Should I decrease it? If I decrease it, autovacuum will run more often; each run will do less write IO, but there will be more periods of high read IO.
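If per-table tuning is the way to go, I was thinking of something along these lines; the values are only guesses and `my_table` is a placeholder:

```sql
-- Sketch of per-table autovacuum settings (numbers are guesses, not recommendations):
-- a much smaller scale factor so autovacuum triggers after fewer dead tuples,
-- and a higher cost limit so each run can finish sooner.
ALTER TABLE my_table SET (
    autovacuum_vacuum_scale_factor = 0.01,  -- trigger at ~1% dead tuples instead of 20%
    autovacuum_vacuum_threshold    = 1000,  -- plus a small fixed threshold
    autovacuum_vacuum_cost_limit   = 2000   -- allow more work per run than the default
);
```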
Also, as the user base grows, the table will keep growing, vacuuming will take more and more time, and vacuum will have to read more and more from disk. So what should I do?
Some solutions I have thought of:
- Split the frequently updated column out into its own table (see the sketch after this list).
- Tweak the parameters so autovacuum runs more often, which decreases write IO per run (as discussed above). But how do I handle the extra read IO, since vacuum will now run more often?
- Combine point 2 with increasing RAM to reduce read IO as well.
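For point 1, the split I have in mind looks roughly like this; the table and column names are made up for illustration:

```sql
-- Hypothetical split: keep the rarely-changed columns in the main table
-- and move the frequently updated field into its own narrow table,
-- keyed by the main table's primary key.
CREATE TABLE my_table_hot (
    id          bigint PRIMARY KEY REFERENCES my_table (id),
    hot_counter bigint NOT NULL DEFAULT 0   -- the frequently updated field
);

-- Updates then touch only the narrow table, so dead tuples (and the vacuum IO
-- they cause) are confined to a table far smaller than the original.
UPDATE my_table_hot SET hot_counter = hot_counter + 1 WHERE id = 42;
```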
In general, what approach do people take here? I assume plenty of people have very big tables, 10 GB or more, that need to be vacuumed.