48

I have a 9 million row table. I figured out that a large amount of it (around 90%) can be freed up. What actions are needed after the cleanup? Vacuum, reindex etc.

Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
Milan Dobrota
  • 489
  • 1
  • 4
  • 4

4 Answers4

63

If you want to free up space on the file system, either VACUUM FULL or CLUSTER can help you. You may also want to run ANALYZE after these, to make sure the planner has up-to-date statistics but this is not specifically required.

It is important to note using VACUUM FULL places an ACCESS EXCLUSIVE lock on your table(s) (blocking any operation, writes & reads), so you probably want to take your application offline for the duration.

In PostgreSQL 8.2 and earlier, VACUUM FULL is probably your best bet.

In PostgreSQL 8.3 and 8.4, the CLUSTER command was significantly improved, so VACUUM FULL is not recommended -- it's slow and it will bloat your indexes. `CLUSTER will re-create indexes from scratch and without the bloat. In my experience, it's usually much faster too. CLUSTER will also sort the whole physical table using an index, so you must pick an index. If you don't know which, the primary key will work fine.

In PostgreSQL 9.0, VACUUM FULL was changed to work like CLUSTER, so both are good.

It's hard to make predictions, but on a properly tuned server with commodity hardware, 9 million rows shouldn't take longer than 20 minutes.

Ermiya Eskandary
  • 15,323
  • 3
  • 31
  • 44
intgr
  • 19,834
  • 5
  • 59
  • 69
  • Can we do this when we have less free space available? Lets say is 9GB space been used from 10GB partition. Only less than 1GB free space available. Can we simply delete records and run VACUUM FULL? – Yasiru G Jan 29 '21 at 03:37
  • 2
    @YasiruG Unfortunately these operations create a full new copy of the table, so if the table (excluding deleted rows) is larger than 1GB, that would fail. You could drop & re-create indexes to temporarily free up space. Another option is to do pg_dump of this table to a separate machine or disk, and then drop table and restore it -- this would free up space just as well as VACUUM. – intgr Jan 31 '21 at 10:56
30

You definitely want to run a VACUUM, to free up that space for future inserts. If you want to actually reclaim that space on disk, making it available to the OS, you'll need to run VACUUM FULL. Keep in mind that VACUUM can run concurrently, but VACUUM FULL requires an exclusive lock on the table.

You will also want to REINDEX, since the indexes will remain bloated even after the VACUUM runs. If possible, a much faster way to do this is to drop the index and create it again from scratch.

You'll also want to ANALYZE, which you can just combine with the VACUUM.

See the documentation for more info.

DNS
  • 37,249
  • 18
  • 95
  • 132
  • was wondering - why dropping and recreating the index is much faster? From what I see (maybe it's not exactly right) when you reindex in creates a new one and replaces the old, shouldn't be much faster no? – Dmitry Polovinkin Jun 02 '23 at 19:32
4

Hi Don't it be more optimal to create a temporary table with 10% of needed records. Then drop original table and rename temporary to original ...

walla
  • 1,087
  • 8
  • 9
1

I'm relatively new to the world of Postgres, but I understand VACUUM ANALYZE is recommended. I think there's also a sub-option which just frees up space. I found reindex useful as well when doing batch inserts or deletes. Yes I've been working with tables with a similar number of rows, and the speed increase is very noticeable (UBuntu, Core 2 Quad)

winwaed
  • 7,645
  • 6
  • 36
  • 81