I have PgSql 13.7 at Amazon RDS, this is production, so can't stop everything there. I run into strange behaviour with dead tuples, there is a lot of bloat in system tables. For example:
pg_attribute: Live tuples=823, Dead tuples=750939
pg_statistic: Live=776 Dead=528366
pg_depend: Live=525 Dead=219947
Running vacuum on that tables doesn't clean anything:
> SELECT txid_current();
38159688
> VACUUM FULL VERBOSE pg_statistic;
INFO: vacuuming “pg_catalog.pg_statistic”
INFO: “pg_statistic”: found 0 removable, 527834 nonremovable row versions in 37988 pages
DETAIL: 527058 dead row versions cannot be removed yet, oldest xmin: 37231295
Same with autovacuum, many-many dead row versions, that cannot be removed yet.
I've checked, I have no long running transactions, all of them finishes in 5-10 seconds. A have several replication slots, but none of them are holding transaction number (xmin=null for evety slot) There is no long living locks in pg_locks also. And also I see many pg_toast_* and pg_temp_* tables. My best guess, that they cannot be cleared until that dead tuples in system tables are exist.
Any ideas, what next step should I do find a root of a problem?
I did manual VACUUM, I've looked for locks etc