0

I have a question about the number of deadtuples after an update scenario in PGSQL.

Due to MVCC protocol in PGSQL, many updates cause the tables to bloat. We may track this with the number of dead tuples in the table. In the following scenario, I can't give any meaning the number of dead tuples in the following scenario.

I am updating "pgbench_accounts" table with the following testcase:

First, I turnedd off autovacuum in the system beforehand. ...=alter system set autovacuum to off; ...=# select pg_reload_conf();

Then I load the pgbench tables. > pgbench -i -- Now, pgbench_accounts has 100.000 tuples

Then I am updating randomly generated tuples as below: >pgbench -t10000 -fupdate-only -n

   update-only tx is as the following:
   \set aid random(1, 100000 * :scale)
   \set delta random(-5000, 5000)
   BEGIN;
   UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
   END;

As you guess, I want to update 10.000 tuples with a serial of simple update tx as shown above.

After the process end, I am seeing: 9543 updated tuples. That seems OK, because some tuples are updated twice or more. The problem is I am seeing only 3277 dead tuples.

Although I turned off autovacuum, some dead tuples disappear. how does this happen? I am expecting to see, at least the same number of updated tuples?

1 Answers1

1

It could be that not all updates were counted (statistics are not necessarily 100% accurate), but more likely the difference can be explained with HOT updates.

If there is still room in the block, and you don't update an indexed column, PostgreSQL can use a HOT update that doesn't require modifying the index. Dead tuples from HOT updates don't need VACUUM for cleanup. Any SELECT or other statement (in your case: UPDATE) can grab a brief lock on the page and reorganize it, getting rid of dead tuples.

You can verify that with

SELECT n_tup_upd, n_tup_hot_upd
FROM pg_stat_all_tables
WHERE relname = 'pgbench_accounts';
Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • I agree that it is about HOT. However, pgbench_accounts table has fillfactor of 100 by default. I got the following output after 10000 updates: n_tup_upd | n_tup_hot_upd | n_dead_tup -----------+---------------+------------ 10000 | 8336 | 3274 I wonder how the system finds space for 8336 HOT tuples in full pages? – utku kalay Jun 20 '23 at 13:54
  • If `fillfactor` is 100, the first updates cannot be HOT. But soon there will be enough blocks that are not filled all the way. If your rows are short, it is easier to get a HOT update. – Laurenz Albe Jun 20 '23 at 14:08