I have a Postgres cluster with 3 nodes: ETCD+Patroni+Postgres13.
Now there was a problem of constantly growing pg_wal
folder. It now contains 5127 files. After searching the internet, I found an article advising you to pay attention to the following database parameters (their meaning at the time of the case is this):
archive_mode off;
wal_level replica;
max_wal_size 1G;
SELECT * FROM pg_replication_slots;
postgres=# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]-------+------------
slot_name | db2
plugin |
slot_type | physical
datoid |
database |
temporary | f
active | t
active_pid | 2247228
xmin |
catalog_xmin |
restart_lsn | 2D/D0ADC308
confirmed_flush_lsn |
wal_status | reserved
safe_wal_size |
-[ RECORD 2 ]-------+------------
slot_name | db1
plugin |
slot_type | physical
datoid |
database |
temporary | f
active | t
active_pid | 2247227
xmin |
catalog_xmin |
restart_lsn | 2D/D0ADC308
confirmed_flush_lsn |
wal_status | reserved
safe_wal_size |
All other functionality of the Patroni cluster works (switchover, reinit, replication);
root@srvdb3:~# patronictl -c /etc/patroni/patroni.yml list
+ Cluster: mobile (7173650272103321745) --+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------+------------+---------+---------+----+-----------+
| db1 | 10.01.1.01 | Replica | running | 17 | 0 |
| db2 | 10.01.1.02 | Replica | running | 17 | 0 |
| db3 | 10.01.1.03 | Leader | running | 17 | |
+--------+------------+---------+---------+----+-----------+
Patroni patroni-edit:
loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
parameters:
checkpoint_timeout: 30
hot_standby: 'on'
max_connections: '1100'
max_replication_slots: 5
max_wal_senders: 5
shared_buffers: 2048MB
wal_keep_segments: 5120
wal_level: replica
use_pg_rewind: true
use_slots: true
retry_timeout: 10
ttl: 100
Help please, what could be the matter?
This is what I see in pg_stat_archiver
:
postgres=# select * from pg_stat_archiver;
-[ RECORD 1 ]------+------------------------------
archived_count | 0
last_archived_wal |
last_archived_time |
failed_count | 0
last_failed_wal |
last_failed_time |
stats_reset | 2023-01-06 10:21:45.615312+00