2

Having READ COMMITTED isolation level, idle transactions that have performed a write operation will prevent vacuum to cleanup dead rows for the tables that transaction wrote in.

That is clear for tables that were written by transactions that are still in progress. Here you can find a good explanation.

But it is not clear to me why this limitation affects also to any other tables.

For example: transaction T is started and it updates table B, vacuum is executed for table A while T is in "idle in transaction" state. In this scenario, why dead rows in A cannot be removed?

Here what I did:

# show default_transaction_isolation;
 default_transaction_isolation 
-------------------------------
 read committed
(1 row)
# create table a (v int);
CREATE TABLE
# create table b (v int);
CREATE TABLE

# insert into a values (generate_series(1,1000));
INSERT 0 1000

At this point I do an update to generate new 1000 dead rows

# update a set v = v + 1;
UPDATE 1000

Vacuuming will remove them as expected:

# vacuum verbose a;
INFO:  vacuuming "public.a"
INFO:  "a": removed 1000 row versions in 5 pages
INFO:  "a": found 1000 removable, 1000 nonremovable row versions in 9 out of 9 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

I now start transaction T writing in table b:

# begin;
BEGIN
# insert into b values (generate_series(1,1000));
INSERT 0 1000

I generate more dead rows again in a different transaction T1 that started after T:

# begin;
# update a set v = v + 1;
# commit;

In a different transaction:

# vacuum verbose a;
INFO:  vacuuming "public.a"
INFO:  "a": found 0 removable, 2000 nonremovable row versions in 9 out of 9 pages
DETAIL:  1000 dead row versions cannot be removed yet.
There were 34 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

This is the relevant part: DETAIL: 1000 dead row versions cannot be removed yet.

If I commit transaction T and execute again vacuum I get dead rows removed as expected:

# vacuum verbose a;
INFO:  vacuuming "public.a"
INFO:  "a": removed 1000 row versions in 5 pages
INFO:  "a": found 1000 removable, 1000 nonremovable row versions in 9 out of 9 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 34 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
alostale
  • 770
  • 1
  • 11
  • 21
  • no. transaction holds dirty pages. if you did not modify table b, your idel in transaction wont affect anyting on table b... I assume :) – Vao Tsun Sep 07 '17 at 15:35
  • @VaoTsun that's also what I assumed, but I wanted to ensure, cause it's not clear to me. – alostale Sep 07 '17 at 15:40
  • ok then. Having READ COMMITTED isolation level, idle transactions that have performed a write operation will NOT prevent autovacuum to cleanup dead rows for the tables that transaction DID NOT write in. I cant find that clearly stated in docs, but It feels natural with Postgres MVCC – Vao Tsun Sep 07 '17 at 15:44
  • Note that a transaction will only keep vacuum from cleaning up new dead tuples since the transaction started. Dead tuples created before that transaction started are full capable of being harvested and reused. – Scott Marlowe Sep 07 '17 at 18:31
  • @VaoTsun I did a simple test case (described above) and it looks like. Writes in **any** table prevent autovacuum in **all** the tables. – alostale Sep 08 '17 at 05:48
  • @ScottMarlowe it is clear it needs to prevent removing old dead rows that could still be seen by ongoing transactations, and that's ok for those tables written by those ongoing transactions. The question is why no table can be vacuumed if there is any ongoing transaction that did any write. – alostale Sep 08 '17 at 05:52
  • @alostale: only in the tables that are used by that transaction. The transaction might still need the old row version(s). If a transaction never uses a table, that table **will** be vacuumed –  Sep 08 '17 at 05:58
  • @a_horse_with_no_name that's clear and it's what I initially assumed. But if my test case is correct (see above), it looks that **all** tables are affected as soon as there is a idle transaction that wrote in **any** table. – alostale Sep 08 '17 at 06:01
  • @alostale I liked your approach so decided to reproduce surprising phenomena - of course I can't reproduce it. `INFO: "a": found 0 removable, 0 nonremovable row versions in 0 out of 9 pages` please share your version and create a scenarion in rexter or sql fiddle (sorry - not sure about site names - anywhere in shared env) – Vao Tsun Sep 08 '17 at 07:29
  • @VaoTsun: I'm using `PostgreSQL 9.4.8 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 5.2.1-22ubuntu2) 5.2.1 20151010, 64-bit` – alostale Sep 08 '17 at 07:34
  • @VaoTsun: sorry but I'm not able to create the fiddle because it looks it does not allow to [execute vacuum](http://sqlfiddle.com/#!17/6e12e/3) – alostale Sep 08 '17 at 07:42
  • @alostale yes and share db in different transanctions wont do either I suppose – Vao Tsun Sep 08 '17 at 07:51
  • But that was my point, a table CAN be vacuumed that has a current transaction accessing it. BUT the tuples updated by that transaction and any that started after it can't be harvested while the transaction is in progress. – Scott Marlowe Sep 08 '17 at 16:15

3 Answers3

1

Following this question up via Twitter.

Current (at least up to PostgreSQL 9.6) behavior is:

Live transactions performing a write operation in any table will prevent vacuuming dead rows generated by commited transactions that started after first live transaction in any other table.

Even this limitation is not required from the conceptual point of view, it is how current algorithm is implemented for performance on checking dead rows reasons.

alostale
  • 770
  • 1
  • 11
  • 21
  • When you say "write operation in any table will prevent vacuuming dead rows....", is it related to any table in that specific schema or any table in database. – Nik Aug 29 '20 at 14:23
0

cant reproduce:

first session script:

-bash-4.2$ cat prim.sql
create table a (v int);
create table b (v int);
insert into a values (generate_series(1,1000));
update a set v = v + 1;
vacuum verbose a;
begin;
  insert into b values (generate_series(1,1000));
  select pg_sleep(9);
  select e'I\'m still open transaction'::text prim;

second session and check state:

-bash-4.2$ cat 1.sh
(sleep 3; psql t -c "vacuum verbose a;") &
(sleep 5; psql t -c "select state,query from pg_stat_activity where state != 'idle' and pid <> pg_backend_pid()") &
psql t -f prim.sql

and run:

-bash-4.2$ bash 1.sh
CREATE TABLE
CREATE TABLE
INSERT 0 1000
UPDATE 1000
psql:prim.sql:5: INFO:  vacuuming "public.a"
psql:prim.sql:5: INFO:  "a": removed 1000 row versions in 5 pages
psql:prim.sql:5: INFO:  "a": found 1000 removable, 1000 nonremovable row versions in 9 out of 9 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
BEGIN
INSERT 0 1000
INFO:  vacuuming "public.a"
INFO:  "a": found 0 removable, 1000 nonremovable row versions in 9 out of 9 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 1000 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
 state  |        query
--------+---------------------
 active | select pg_sleep(9);
(1 row)

 pg_sleep
----------

(1 row)

            prim
----------------------------
 I'm still open transaction
(1 row)

As you can see the first session was active before, while and after the vacuum in different session took place.

the version I tried at is:

t=# select version();
                                                   version
--------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.3.14 on x86_64-redhat-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit
(1 row)
Vao Tsun
  • 47,234
  • 13
  • 100
  • 132
0

It is important to generate the dead rows again in a transaction started AFTER the transaction that remains open.

I have been able to reproduce the problem with the following versions:

  • PostgreSQL 9.3.19 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4, 64-bit

  • PostgreSQL 9.5.9 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4, 64-bit'

caristu
  • 115
  • 1
  • 5