1

Postgres doc tells that partitioned tables are not processed by autovacuum. But still I see that last_autovacuum column from pg_stat_user_tables is populated with recent timestamps for live partitions.

Does it mean that these timestamps are set by the background worker which only prevents transaction ID wraparound, without actually performing ANALYZE&VACUUM? Or whatever else could populate them?

Besides, taken that partitions are large and active enough, should I run the both ANALYZE and VACUUM manually on those partitions? If yes, does the order matter?

UPDATE

I'm trying to elaborate, thanks to the comments given.

  1. Taking that vacuum should work the same way on partition as on the regular table, what could be a reason for much faster growth of the occupied disk space after partitioning? Before partitioning it was nearly a linear function of records count.

  2. What is confusing as well, when looking for autovacuum processes running I see that those related to partitions are denoted with "to prevent wraparound", while others are not. Is it absolutely a coincidence or there is something to check?

  3. Documentation describes partitioned table as rather a virtual entity, without its own storage. What is the point in denoting that it is not vacuumed?

user3714601
  • 1,156
  • 9
  • 27
  • Where did you find that information? Every table will be vacuumed when there is a need for – Frank Heikens Oct 04 '22 at 17:54
  • 1
    @FrankHeikens, right here: https://www.postgresql.org/docs/current/routine-vacuuming.html Partitioned tables are not processed by autovacuum. Statistics should be collected by running a manual ANALYZE when it is first populated, and again whenever the distribution of data in its partitions changes significantly. – user3714601 Oct 04 '22 at 17:59
  • 1
    The partitions are vacuumed and analyzed as usual. It is the parent table alone that is not. – jjanes Oct 04 '22 at 18:10
  • A partition is vacuumed when data has changed, when there is a need for. It’s not done because of a vacuum on the parent. These are different tables. And yes, one could be a parent and the other a child – Frank Heikens Oct 04 '22 at 18:15
  • @FrankHeikens , thanks a lot for your comments. May I ask you to take a look at the update above? – user3714601 Oct 04 '22 at 18:50
  • @jjanes thanks a lot for your comments. May I ask you to take a look at the update above? – user3714601 Oct 04 '22 at 18:51
  • For number 2 we need way more information, and number 3 is the opposite of what I can find in the documentation – Frank Heikens Oct 04 '22 at 19:00
  • @FrankHeikens, a quote from the docs (https://www.postgresql.org/docs/current/ddl-partitioning.html): The partitioned table itself is a “virtual” table having no storage of its own. Instead, the storage belongs to partitions, which are otherwise-ordinary tables associated with the partitioned table. – user3714601 Oct 04 '22 at 19:23
  • The parent table is empty, is virtual. But still needs a vacuum once and a while to avoid the wrap around. auto vacuum takes care of this. The partitions will be vacuumed more often, when data has changed. But, what problem do you try to solve? – Frank Heikens Oct 04 '22 at 19:57
  • @FrankHeikens, the (1), much faster growth of the disk space occupied, compared to the non-partitioned version. – user3714601 Oct 04 '22 at 20:46
  • Then you should ask a question about that issue and give us some information – Frank Heikens Oct 04 '22 at 20:50
  • Which partition is growing? All of them? Use [pgstattuple](https://www.postgresql.org/docs/current/pgstattuple.html) to see if there is a lot of dead space or free space. Maybe someone adopted partitioning in anticipation of a major increase in activity, and now that increase has happened. – jjanes Oct 04 '22 at 22:40

1 Answers1

2

The statement from the documentation is true, but misleading. Autovacuum does not process the partitioned table itself, but it processes the partitions, which are regular PostgreSQL tables. So dead tuples get removed, the visibility map gets updated, and so on. In short, there is nothing to worry about as far as vacuuming is concerned. Remember that the partitioned table itself does not hold any data!

What the documentation warns you about is ANALYZE. Autovacuum also launches automatic ANALYZE jobs to collect accurate table statistics. This will be work fine on the partitions, but there are no table statistics collected on the partitioned table itself, so you have to run ANALYZE manually on the partitioned table to get these data. In practice, I find that not to be a problem, since the optimizer generates plans for each individual partition anyway, and there it has accurate statistics.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • Thank you! Is there a chance to correct/complete that part of the doc? I'm quite sure that all of my colleagues have read it the same way, i.e. wrong. – user3714601 Oct 05 '22 at 07:10
  • 2
    I suggested a [documentation patch](https://postgr.es/m/1fd81ddc7710a154834030133c6fea41e55c8efb.camel%40cybertec.at). – Laurenz Albe Oct 05 '22 at 08:39