
PostgreSQL 9.5.2 on AWS RDS

select name,setting from pg_settings 
where name like '%vacuum%' 
order by name;
                name                 |  setting
-------------------------------------+-----------
 autovacuum                          | on
 autovacuum_analyze_scale_factor     | 0.05
 autovacuum_analyze_threshold        | 50
 autovacuum_freeze_max_age           | 450000000
 autovacuum_max_workers              | 3
 autovacuum_multixact_freeze_max_age | 400000000
 autovacuum_naptime                  | 30
 autovacuum_vacuum_cost_delay        | 20
 autovacuum_vacuum_cost_limit        | -1
 autovacuum_vacuum_scale_factor      | 0.1
 autovacuum_vacuum_threshold         | 50
 autovacuum_work_mem                 | -1
 log_autovacuum_min_duration         | 0
 rds.force_autovacuum_logging_level  | log
 vacuum_cost_delay                   | 0
 vacuum_cost_limit                   | 300
 vacuum_cost_page_dirty              | 20
 vacuum_cost_page_hit                | 1
 vacuum_cost_page_miss               | 10
 vacuum_defer_cleanup_age            | 0
 vacuum_freeze_min_age               | 50000000
 vacuum_freeze_table_age             | 250000000
 vacuum_multixact_freeze_min_age     | 5000000
 vacuum_multixact_freeze_table_age   | 150000000

I've been trying to figure out how autovacuum is working in two Postgres databases. The databases are identical in size, parameters, and structure. (These are two data warehouses for the same application, in different locations and with different data patterns.)

We are using partitions for some of our very large tables. I've noticed that the older (static) partitions are regularly getting autovacuumed. I understand that old XIDs eventually get frozen, but the relation still needs to be vacuumed periodically to look for any new unfrozen XIDs.
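
As I understand it, autovacuum forces an anti-wraparound vacuum once age(relfrozenxid) passes autovacuum_freeze_max_age (450 million on this instance). This is a sketch of the check I have in mind for spotting those candidates ahead of time; the 90% threshold is an arbitrary choice of mine:

-- Relations whose XID age is approaching autovacuum_freeze_max_age
-- (450,000,000 here), i.e. candidates for a forced anti-wraparound
-- autovacuum. The 0.9 factor is arbitrary.
SELECT c.oid::regclass     AS table_name,
       age(c.relfrozenxid) AS xid_age,
       (SELECT setting::bigint
          FROM pg_settings
         WHERE name = 'autovacuum_freeze_max_age') AS freeze_max_age
FROM pg_class c
WHERE c.relkind IN ('r', 'm')
  AND age(c.relfrozenxid) > 0.9 * (SELECT setting::bigint
                                     FROM pg_settings
                                    WHERE name = 'autovacuum_freeze_max_age')
ORDER BY age(c.relfrozenxid) DESC;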

I've been using this query to look for relations that will require vacuuming to avoid XID wraparound:

SELECT 'Relation Name',age(c.relfrozenxid) c_age, age(t.relfrozenxid) t_age,
       greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
FROM pg_class c
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHERE c.relkind IN ('r', 'm')
order by age desc limit 5;

   ?column?    |   c_age   |   t_age   |    age
---------------+-----------+-----------+-----------
 Relation Name | 461544753 |           | 461544753
 Relation Name | 461544753 |           | 461544753
 Relation Name | 461544753 |           | 461544753
 Relation Name | 461544753 |           | 461544753
 Relation Name | 461544753 | 310800517 | 461544753

All of the relations listed are old stable partitions. The column relfrozenxid is defined as: "All transaction IDs before this one have been replaced with a permanent ("frozen") transaction ID in this table. This is used to track whether the table needs to be vacuumed in order to prevent transaction ID wraparound or to allow pg_clog to be shrunk."
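
If I read that correctly, age(relfrozenxid) is just the distance between the stored relfrozenxid and the current transaction counter, so it climbs on its own as the cluster consumes XIDs even if a table is never written to. A quick way to see both numbers side by side (the table name is just one of the anonymized partition names from the listing below):

-- age(relfrozenxid) is roughly the current XID counter minus relfrozenxid,
-- so it grows as the cluster burns through XIDs even when the table
-- itself never changes. The table name is a placeholder.
SELECT txid_current()       AS current_xid,
       c.relfrozenxid       AS relfrozenxid,
       age(c.relfrozenxid)  AS xid_age
FROM pg_class c
WHERE c.oid = 'schema_partition.tablename_201202'::regclass;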

Out of curiosity I looked at relfrozenxid for all of the partitions of a particular table:

SELECT c.oid::regclass as table_name,age(c.relfrozenxid) as age , c.reltuples::int, n_live_tup, n_dead_tup,
         date_trunc('day',last_autovacuum)
FROM pg_class c
JOIN pg_stat_user_tables u on c.relname = u.relname
WHERE c.relkind IN ('r', 'm')
and  c.relname like 'tablename%'
order by age desc;

      table_name                     |    age    | reltuples | n_live_tup | n_dead_tup |       date_trunc
-------------------------------------+-----------+-----------+------------+------------+------------------------
 schema_partition.tablename_201202   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201306   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201204   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201110   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201111   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201112   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201201   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201203   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201109   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201801   | 435086084 |  37970232 |   37970230 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201307   | 433975635 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201107   | 433975635 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201312   | 433975635 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201311   | 433975635 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201401   | 433975635 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201310   | 423675180 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201704   | 423222113 |  43842668 |   43842669 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201612   | 423222113 |  65700844 |   65700845 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201705   | 423221655 |  46847336 |   46847338 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201702   | 423171142 |  50701032 |   50701031 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_overflow | 423171142 |       754 |        769 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201106   | 421207271 |         1 |          1 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201309   | 421207271 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201108   | 421207271 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201308   | 421207271 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201806   | 374122782 |  44626756 |   44626757 |          0 | 2018-09-26 00:00:00+00
 schema.tablename                    | 360135561 |         0 |          0 |          0 | 2018-09-27 00:00:00+00

I'm pretty sure I don't fully understand how relfrozenxid works, but it does appear that the partitions are affected by activity on the parent table (which would in turn affect the relfrozenxid values of the partitions). I can't find any documentation about this. I would have expected the relfrozenxid of a static table to stay the same until a vacuum ran against it.
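
Since this is 9.5, the partitioning is plain inheritance, so one thing I plan to do is compare the parent and its children directly through pg_inherits to see whether their relfrozenxid values really move together (the parent name is a placeholder):

-- Compare relfrozenxid of the parent table and its inheritance children.
-- 'schema.tablename' is a placeholder for the real parent table.
SELECT p.inhparent::regclass  AS parent,
       c.oid::regclass        AS child,
       age(pc.relfrozenxid)   AS parent_xid_age,
       age(c.relfrozenxid)    AS child_xid_age
FROM pg_inherits p
JOIN pg_class c  ON c.oid  = p.inhrelid
JOIN pg_class pc ON pc.oid = p.inhparent
WHERE p.inhparent = 'schema.tablename'::regclass
ORDER BY age(c.relfrozenxid) DESC;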

Additionally, I have a handful of relations with static data that apparently have never been autovacuumed (last_autovacuum is NULL). Could this be the result of a VACUUM FREEZE operation?
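
To test that idea I intend to check whether those relations show a manual vacuum instead of an autovacuum in pg_stat_user_tables; something along these lines:

-- A NULL last_autovacuum alongside a non-NULL last_vacuum (or a
-- non-zero vacuum_count) would point at a manual VACUUM / VACUUM FREEZE
-- having done the work instead of autovacuum.
SELECT schemaname,
       relname,
       last_vacuum,
       last_autovacuum,
       vacuum_count,
       autovacuum_count
FROM pg_stat_user_tables
WHERE last_autovacuum IS NULL
ORDER BY schemaname, relname;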

I am new to Postgres and I readily admit to not fully understanding the auto vacuum processes.

I'm not seeing any performance problems that I can identify.

Edit:

I set up a query to run every 4 hours against one partitioned table:

SELECT c.oid::regclass as table_name,age(c.relfrozenxid) as age , c.reltuples::int, n_live_tup, n_dead_tup,
         date_trunc('day',last_autovacuum)
FROM pg_class c
JOIN pg_stat_user_tables u on c.relname = u.relname
WHERE c.relkind IN ('r', 'm')
and  c.relname like 'sometable%'
order by age desc;

Looking at two different partitions, here is the output for the last 20 hours:

 schemaname.sometable_201812   | 206286536 |         0 |          0 |          0 |
 schemaname.sometable_201812   | 206286537 |         0 |          0 |          0 |
 schemaname.sometable_201812   | 225465100 |         0 |          0 |          0 |
 schemaname.sometable_201812   | 225465162 |         0 |          0 |          0 |
 schemaname.sometable_201812   | 225465342 |         0 |          0 |          0 |
 schemaname.sometable_201812   | 236408374 |         0 |          0 |          0 |
-bash-4.2$  grep 201610 test1.out
 schemaname.sometable_201610   | 449974426 |  31348368 |   31348369 |          0 | 2018-09-22 00:00:00+00
 schemaname.sometable_201610   | 449974427 |  31348368 |   31348369 |          0 | 2018-09-22 00:00:00+00
 schemaname.sometable_201610   | 469152990 |  31348368 |   31348369 |          0 | 2018-09-22 00:00:00+00
 schemaname.sometable_201610   |  50000051 |  31348368 |   31348369 |          0 | 2018-10-10 00:00:00+00
 schemaname.sometable_201610   |  50000231 |  31348368 |   31348369 |          0 | 2018-10-10 00:00:00+00
 schemaname.sometable_201610   |  60943263 |  31348368 |   31348369 |          0 | 2018-10-10 00:00:00+00

The relfrozenxid of the partitions is being modified even though there is no direct DML against them. I would assume that inserts into the base table are somehow causing the relfrozenxid of the partitions to change.
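
To tell the two cases apart (the age drifting up simply because the cluster's XID counter advances, versus relfrozenxid itself being rewritten by an autovacuum), I'm going to add the raw value and the current XID to the 4-hourly collection; a sketch of what I mean:

-- Record the raw relfrozenxid together with the current XID counter.
-- If relfrozenxid stays constant while txid_current() climbs, only the
-- age is moving, not the stored value.
SELECT now()                AS sampled_at,
       txid_current()       AS current_xid,
       c.oid::regclass      AS table_name,
       c.relfrozenxid       AS relfrozenxid,
       age(c.relfrozenxid)  AS xid_age
FROM pg_class c
WHERE c.relkind = 'r'
  AND c.relname like 'sometable%'
ORDER BY age(c.relfrozenxid) DESC;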

The partition sometable_201610 has 31 million rows but is static. When I look at the log files, the autovacuum of this type of partition takes 20-30 minutes. I don't know whether that is a performance problem, but it does seem expensive. The autovacuum entries in the log files show that typically several of these large partitions are autovacuumed every night. (There are also lots of partitions with zero tuples that get autovacuumed, but those take very little time.)
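
If it turns out to be worth doing anything about it, my working idea is to freeze the static partitions manually during a quiet window, so that relfrozenxid is pushed right up to the current XID and the next forced anti-wraparound autovacuum is as far away as possible. A sketch, with the partition name again a placeholder:

-- Freeze one static partition manually, then confirm that its XID age
-- has dropped. The partition name is a placeholder from the output above.
VACUUM (FREEZE, VERBOSE) schemaname.sometable_201610;

SELECT c.oid::regclass     AS table_name,
       age(c.relfrozenxid) AS xid_age_after_freeze
FROM pg_class c
WHERE c.oid = 'schemaname.sometable_201610'::regclass;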

  • Yes, if you run `VACUUM` manually, autovacuum doesn't have to run. Apart from that, I don't understand what your problem/question is. – Laurenz Albe Oct 09 '18 at 18:17
  • I'm looking for some kind of clarification on why the relfrozenxid increases on static partitions and if there is a way to freeze the partitions to avoid autovacuum. – Evelyn Dibben Oct 09 '18 at 20:18
  • I'd say that autovacuum still runs, but does nothing except increase the `relfrozenxid`. Nothing to worry about. – Laurenz Albe Oct 09 '18 at 21:46
  • Thanks for the replies. Overnight I set up a query to run every 4 hours to collect information about 1 particular partitioned table: – Evelyn Dibben Oct 10 '18 at 13:34
  • Is there any update on this? We are also facing this, but in our case autovacuum is taking time on the older partitions; it is not instant. – Vishwesh Jainkuniya Feb 04 '20 at 15:32
  • I haven't done any more work on this issue. For me it was a case of if it ain't really broke don't fix it. – Evelyn Dibben Feb 21 '20 at 17:58
