1

I have a PostgreSQL database with a total size of 1.7 TB, of which 1.6 TB is occupied by a single table. The total disk space is 1.7 TB, so I am now out of disk space.

I cannot run VACUUM on it. I tried truncating a few tables and then used the freed space to run VACUUM on the other tables, which recovered some space. But the problem is that I cannot run VACUUM on the table that occupies 99% of the space.

So I started taking a backup so that I can truncate this table. The backup is still running, and its size has grown past 3 TB and keeps growing.

I am new to PostgreSQL and have no idea how to handle this situation.

I have 4 TB of temporary space that I am using for the backup, but I am afraid the backup is going to grow beyond the available space.

Any help/guidance will be highly appreciated.

Fahad Nisar
  • About [`vacuum`](https://www.postgresql.org/docs/current/sql-vacuum.html): "_... extra space is not returned to the operating system ..._" To make your dump smaller, use the `--format` option with the `c` or `d` value of the [`pg_dump`](https://www.postgresql.org/docs/current/app-pgdump.html) utility – Abelisto Nov 06 '20 at 06:20
  • @Abelisto it took me 2 days to dump it this far. I am also tight on time. Any suggestions on how to get out of this situation without spending a lot more time? – Fahad Nisar Nov 06 '20 at 06:23
  • The "brute-force" way: drop unused or rarely used indexes on this huge table to free some space. Then you will have a time to analyze the situation and find the solution. BTW such questions should be asked [here](https://dba.stackexchange.com) – Abelisto Nov 06 '20 at 06:30
  • Noted. Let me ask it there. – Fahad Nisar Nov 06 '20 at 06:33
  • If it is on Linux, there are possible tricks like compressing each 1 GB chunk of the output file separately and then "poking a hole" in it with `fallocate --punch-hole`. Also, if a pg_dump of 2 TB is taking 2 days, then maybe your pg_dump process is too far away network-wise from the database - it's better to run pg_dump as close as possible and then only copy the compressed output between networks. – Tometzky Nov 06 '20 at 07:31
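
A minimal sketch of Abelisto's index suggestion, assuming a hypothetical table name `largetable` and a placeholder index name `largetable_unused_idx`; note that `idx_scan` only reflects usage since the statistics counters were last reset:

```sql
-- List indexes on the big table together with their size and scan count,
-- biggest first; indexes with idx_scan = 0 are candidates for dropping.
SELECT indexrelid::regclass AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
       idx_scan
FROM pg_stat_user_indexes
WHERE relid = 'largetable'::regclass
ORDER BY pg_relation_size(indexrelid) DESC;

-- Dropping an index returns its disk space immediately.
DROP INDEX largetable_unused_idx;
```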

2 Answers

1

Use

```
pg_dump -F c -Z 9 -f dumpfile -t largetable dbname
```

That will be slow, but compresses the dump as much as possible.

It is perfectly normal for a plain-text dump to be larger than the table. For example, an integer occupies 4 bytes and a timestamp 8 bytes, but their textual representation can be bigger. Also, larger column values get compressed on disk, while a dump stores them uncompressed unless the dump itself is compressed.

Laurenz Albe
0

If you have no option of adding more storage, then you could perhaps handle the situation with a gradual migration of the rows into a new table. Consider partitioning the table if it isn't partitioned already.
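
A minimal sketch of what such a gradual migration might look like, assuming a hypothetical `largetable` with an `id` key and a `created_at` column to partition by; all names, dates and batch sizes here are placeholders, not taken from the question:

```sql
-- New partitioned table with the same column definitions (placeholder schema).
CREATE TABLE largetable_new (LIKE largetable INCLUDING DEFAULTS)
    PARTITION BY RANGE (created_at);

CREATE TABLE largetable_new_2020 PARTITION OF largetable_new
    FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');

-- Move rows in small batches so each step needs little extra space.
WITH moved AS (
    DELETE FROM largetable
    WHERE id IN (SELECT id FROM largetable ORDER BY id DESC LIMIT 10000)
    RETURNING *
)
INSERT INTO largetable_new SELECT * FROM moved;

-- Plain VACUUM only shrinks the file by truncating empty pages at its end,
-- so repeat the batch above and VACUUM until enough space is reclaimed.
VACUUM largetable;
```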

The best way to deal with your situation is to add a storage device, even temporarily.

Regarding the difference in size between your table and the backup, I can think of two possible reasons:

  1. You're not computing the correct table size. Perhaps there is a lot of data in the TOAST table (see the query after this list).

  2. You have lots of binary data and are backing up to a textual format. If that is the case, each byte of binary data gets expanded to two hexadecimal characters.
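
For the first point, a small query along these lines (with `largetable` as a placeholder name) shows how much of the size lives in the TOAST table and the indexes rather than in the main table file:

```sql
-- pg_total_relation_size includes TOAST data and indexes;
-- pg_relation_size covers only the main table file.
SELECT pg_size_pretty(pg_relation_size('largetable'))       AS main_table,
       pg_size_pretty(pg_total_relation_size('largetable')) AS with_toast_and_indexes;
```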

Jonathan Jacobson