
I have a large database whose largest tables have more than 30 million records. The database server is a dedicated machine with 64 cores and 128 GB of RAM, running Ubuntu and Postgres 12, so the server is more powerful than we normally need. It receives around 300-400 new records every second.

The problem is that after about a week to 10 days of use the database becomes extremely slow, so we have to perform VACUUM FULL ANALYZE, after which everything goes back to normal. But this means putting our server into maintenance mode and performing this operation every week, which is a pain.

I thought we might not need a VACUUM FULL and could just run ANALYZE on the database, since it can run in parallel, but that didn't work: there was no performance gain after running it. Even when I run a plain VACUUM on the whole database and then run ANALYZE after it, it still doesn't give the kind of performance boost that we get from VACUUM FULL ANALYZE.

I know that VACUUM FULL copies the data from the old table into a new table and deletes the old one. But what else does it do?
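For context on what a plain VACUUM leaves behind, a bloat check along these lines can show whether dead tuples are piling up faster than autovacuum reclaims them (a sketch using the standard `pg_stat_user_tables` statistics view; no extension needed):

```sql
-- Dead-tuple counts and last vacuum times for the most bloated tables.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2)
           AS dead_pct,
       last_vacuum,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 15;
```

If `dead_pct` stays high even after a manual VACUUM, the space is being reclaimed logically but the table and index files stay large, which is exactly the gap VACUUM FULL closes by rewriting them.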

Update: I have also reindexed the 15 largest tables to check whether that would speed up the database, but that didn't work either.

So I had to execute VACUUM FULL ANALYZE, as I didn't see any other way. Now I am trying to identify the slow queries.

Thanks to jjanes, I was able to enable track_io_timing and also identified a few queries where indexes can be added. I am querying pg_stat_statements like this:

SELECT * FROM pg_stat_statements ORDER BY total_time DESC;

and I get this result:

userid              | 10
dbid                | 16401
queryid             | -3264485807545194012
query               | update events set field1 = $1, field2 = $2 , field3= $3, field4 = $4 , field5 =$5 where id = $6
calls               | 104559
total_time          | 106180828.60536088
min_time            | 3.326082
max_time            | 259055.09376800002
mean_time           | 1015.5111334783633
stddev_time         | 1665.0715182035976
rows                | 104559
shared_blks_hit     | 4456728574
shared_blks_read    | 4838722113
shared_blks_dirtied | 879809
shared_blks_written | 326809
local_blks_hit      | 0
local_blks_read     | 0
local_blks_dirtied  | 0
local_blks_written  | 0
temp_blks_read      | 0
temp_blks_written   | 0
blk_read_time       | 15074237.05887792
blk_write_time      | 15691.634870000113

This query simply updates one record, and the table holds around 30 million records.

Question: This query already uses an index, so can you please guide me on what the next step should be, and why is it slow? Also, what I/O information does this output show?
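As jjanes suggests in the comments, the next step is usually to look at the plan and buffer counts for the exact statement. A sketch of that (the literal values stand in for the `$1..$6` parameters and are made up):

```sql
-- With track_io_timing = on, the plan includes I/O read/write timings,
-- so you can see whether the time goes to index lookups, heap fetches,
-- or disk reads of cold pages.
EXPLAIN (ANALYZE, BUFFERS)
UPDATE events
   SET field1 = 'a', field2 = 'b', field3 = 'c',
       field4 = 'd', field5 = 'e'
 WHERE id = 12345;
```

The `shared_blks_read` figure above (roughly half of all block accesses, with `blk_read_time` at ~15,000 seconds across the calls) suggests many of these updates are hitting pages that are not in cache.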

Omer Farooq
  • Turn on track_io_timing, then find a slow query and look at EXPLAIN (ANALYZE, BUFFERS) before and after – jjanes Sep 21 '21 at 13:54
  • @jjanes The problem is that these queries have been working fine for a couple of years, and we have recently started to face these issues. We used to vacuum the db every 3-4 months, and now this interval has shrunk to 1-2 weeks. – Omer Farooq Sep 21 '21 at 15:43
  • Regardless of how long it worked well for, the way to figure it out remains the same... – jjanes Sep 21 '21 at 16:28
  • Is there a way to enable this configuration without restarting the database? Any explanation on how to use it would be helpful. Thanks – Omer Farooq Sep 21 '21 at 16:34
  • That looks a lot like your tables get bloated. Perhaps you need to tune autovacuum, perhaps you need to change your table definitions or your workload. – Laurenz Albe Sep 21 '21 at 16:39
  • @LaurenzAlbe I also thought about tuning autovacuum, but even when a manual vacuum is executed there are no improvements. Does the VACUUM FULL command reindex the whole table? I have not tried that yet. – Omer Farooq Sep 21 '21 at 16:47
  • Yes, `VACUUM (FULL)` rebuilds all indexes. – Laurenz Albe Sep 21 '21 at 21:03
  • track_io_timing just requires a reload, not a restart. You can also change it just within a session, but I don't think there is any reason not to do it globally and permanently. – jjanes Sep 22 '21 at 04:05
  • @jjanes, I was able to configure track_io_timing, and I have updated my question. There is a query that already uses an index and is very simple, but it performs slowly. – Omer Farooq Sep 22 '21 at 16:41
  • But we still need to see `EXPLAIN (ANALYZE, BUFFERS)` of the query. – jjanes Sep 25 '21 at 17:05
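The reload-only change jjanes describes in the comments can be sketched as follows (requires superuser; `ALTER SYSTEM` writes to `postgresql.auto.conf`):

```sql
-- Enable I/O timing collection without a server restart.
ALTER SYSTEM SET track_io_timing = on;  -- persisted in postgresql.auto.conf
SELECT pg_reload_conf();                -- apply the change via config reload
SHOW track_io_timing;                   -- verify the new setting
```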

1 Answer


As you say, VACUUM FULL is an expensive command. Postgres's secret weapon is autovacuum, which monitors database statistics and targets tables with dead tuples. Read about how to tune it for the database as a whole, and possibly for the big tables individually.
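Per-table tuning can look like this (a sketch only; the table name matches the question's `events` table, but the thresholds are illustrative, not a recommendation for this specific workload):

```sql
-- Default autovacuum triggers after ~20% of a table is dead tuples
-- (autovacuum_vacuum_scale_factor = 0.2), which on a 30M-row table
-- means millions of dead rows before a vacuum starts. Lower the
-- threshold for the hot table so it is vacuumed far more often.
ALTER TABLE events SET (
    autovacuum_vacuum_scale_factor = 0.01,  -- vacuum after ~1% dead rows
    autovacuum_vacuum_cost_delay   = 2      -- ms; throttle the worker less
);

-- Verify the per-table overrides:
SELECT reloptions FROM pg_class WHERE relname = 'events';
```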

Tom Harrison