
Apologies, this is a long post and question, and I'm a Postgres newbie.

I want to know how Postgres VACUUM and VACUUM FULL work, and which is preferable for archiving a 700 GB table without downtime.

VACUUM FULL CASE:

In this case it looks like the disk space is released to the OS after the VACUUM FULL command is executed, but n_dead_tup stays the same.

Will the space counted by n_dead_tup be reused by the next INSERT / UPDATE / DELETE?

STEPS:

CREATE TABLE Foo (
  id SERIAL PRIMARY KEY,
  x  INTEGER NOT NULL
);

// CREATE TABLE

INSERT INTO Foo (x) VALUES (generate_series(1,1000000));

// INSERT 0 1000000

SELECT pg_size_pretty(pg_relation_size('foo'));

// 35 MB

SELECT n_dead_tup FROM pg_stat_user_tables WHERE relname = 'foo';

// 0

UPDATE Foo SET x = x + 1;

// UPDATE 1000000

SELECT pg_size_pretty(pg_relation_size('foo'));

// 69 MB

SELECT n_dead_tup FROM pg_stat_user_tables WHERE relname = 'foo';

// 1000000

VACUUM FULL VERBOSE foo;

// INFO:  vacuuming "public.foo"
// INFO:  "foo": found 1000000 removable, 1000000 nonremovable row versions in 8850 pages
// DETAIL:  0 dead row versions cannot be removed yet.
// CPU 0.07s/0.67u sec elapsed 0.76 sec.
// VACUUM

SELECT pg_size_pretty(pg_relation_size('foo'));

// 35 MB

SELECT n_dead_tup FROM pg_stat_user_tables WHERE relname = 'foo';

// 1000000
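
(Side note: n_dead_tup comes from the statistics collector, so the 1000000 shown here may just be a stale counter rather than real dead rows, given that VACUUM FULL rewrote the table. A hedged check, assuming ANALYZE refreshes the counter:)

ANALYZE foo;

// ANALYZE

SELECT n_dead_tup FROM pg_stat_user_tables WHERE relname = 'foo';

// expected: 0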

VACUUM CASE:

STEPS:

Same steps as above, except that instead of VACUUM FULL I run the command below. In this case it looks like the dead tuples are removed, but the space is not released to the OS.

VACUUM VERBOSE foo;

// INFO:  vacuuming "public.foo"
// INFO:  scanned index "foo_pkey" to remove 1000000 row versions
// DETAIL:  CPU 0.01s/0.27u sec elapsed 0.34 sec.
// INFO:  "foo": removed 1000000 row versions in 4425 pages
// DETAIL:  CPU 0.00s/0.02u sec elapsed 0.02 sec.
// INFO:  index "foo_pkey" now contains 1000000 row versions in 8237 pages
// DETAIL:  1000000 index row versions were removed.
// 0 index pages have been deleted, 0 are currently reusable.
// CPU 0.00s/0.00u sec elapsed 0.00 sec.
// INFO:  "foo": found 1000000 removable, 1000000 nonremovable row versions in 8850 out of 8850 pages
// DETAIL:  0 dead row versions cannot be removed yet.
// There were 0 unused item pointers.
// Skipped 0 pages due to buffer pins.
// 0 pages are entirely empty.
// CPU 0.03s/0.52u sec elapsed 0.61 sec.
// VACUUM

SELECT pg_size_pretty(pg_relation_size('foo'));

// 69 MB

SELECT n_dead_tup FROM pg_stat_user_tables WHERE relname = 'foo';

// 0

IS AUTOVACUUM EQUIVALENT TO VACUUM?

Thanks in advance.

Sathish

1 Answer


A simple VACUUM usually does not free disk space, as explained in the documentation:

[...] However, extra space is not returned to the operating system (in most cases); it's just kept available for re-use within the same table. [...]
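
You can see this reuse directly by continuing the example from the question (a sketch against the same foo table; the exact numbers may vary):

UPDATE foo SET x = x + 1;

// UPDATE 1000000

SELECT pg_size_pretty(pg_relation_size('foo'));

// still about 69 MB: the new row versions fit into the space the plain
// VACUUM freed inside the table, so the file does not grow again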

VACUUM FULL is different because it rewrites the whole table, as explained in the documentation:

[...] This method also requires extra disk space, since it writes a new copy of the table and doesn't release the old copy until the operation is complete. [...]

With the rewriting, the data gets written physically close together, without holes left by older row versions. This growing disk space requirement is one of the downsides of PostgreSQL's MVCC implementation and needs careful observation and tuning for big, update-heavy tables.
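
For that observation, a simple starting point is the standard pg_stat_user_tables view (a sketch; which dead_pct value you act on depends entirely on your workload):

SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(n_dead_tup * 100.0 / nullif(n_live_tup + n_dead_tup, 0), 2) AS dead_pct,
       last_vacuum,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;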

Autovacuum will never execute a VACUUM FULL, since that would block all other activity on the table it is working on, with unpredictable consequences for any connected application. In other words: autovacuum is equivalent to a plain VACUUM (it also runs ANALYZE when needed), never to VACUUM FULL.
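
To verify that autovacuum is enabled and see how aggressively it is tuned, you can inspect the relevant settings (these parameters exist in any stock PostgreSQL installation):

SELECT name, setting
FROM pg_settings
WHERE name LIKE 'autovacuum%';

// 'autovacuum' should be 'on'; 'autovacuum_vacuum_scale_factor' and
// 'autovacuum_vacuum_threshold' control when a table gets vacuumed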

If you run into issues here, you should at least start thinking about partitioning the table so that either the impact of a maintenance VACUUM FULL stays low (because each partition is small), or, even better, so that you can simply drop a partition instead of rewriting anything (see the sketch below). But that really depends on the semantics of the data you are storing and updating.
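
For the 700 GB archival case this might look roughly like the following (a sketch assuming PostgreSQL 10+ declarative partitioning and a hypothetical created_at column as the archival criterion):

CREATE TABLE foo_part (
  id         BIGSERIAL,
  x          INTEGER NOT NULL,
  created_at DATE NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE foo_part_2018 PARTITION OF foo_part
  FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');

CREATE TABLE foo_part_2019 PARTITION OF foo_part
  FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');

-- Archiving a year is then a fast metadata operation instead of a
-- 700 GB rewrite; the disk space goes back to the OS immediately:
DROP TABLE foo_part_2018;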

Ancoron