2

I have two servers, a master and a replica that work together in asynchronous replication mode; slave server seems to be working fine, since any change in the master is mirrored in the slave right away. Moreover, there is also an archive process that copy the wal files from the master to another filesystem to keep them safety. My doubt is, what wal files can I delete by means of pg_archivecleanup? I guess that I need look for wal files with .backup extension in the master and then I could delete wal files older than last backup.

For instance, if I have these files in the master server

000000010000000000000089
000000010000000000000088.00000028.backup
000000010000000000000088
000000010000000000000087
000000010000000000000086
...

I come to the conclusion that is safe deleting 000000010000000000000088 and older files, and keep the newest ones.

The problem is that I don't find .backup files anywhere: neither in the master, nor in the replica, nor in the archive location.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
sergi_bcn
  • 21
  • 1
  • The point of "pg_archivecleanup" is that it figures out for itself what it is safe to delete, based on the %r substitution. Are you trying to use pg_archivecleanup, or re-implement it? – jjanes Sep 10 '19 at 20:06
  • I was trying to figure out what wal files may be deleted by means of pg_archivecleanup and I was thinking that new wal backup files are generated as long as the streaming replication works, but now I realize that backup files are only generated by pg_basebackup. – sergi_bcn Sep 10 '19 at 23:25

2 Answers2

2

The *.backup files are created if you perform an online backup with pg_basebackup or similar and you are archiving WAL files using the archive_command.

In that case you can use pg_archivecleanup with such a file as argument to automatically remove all WAL archives that are older than that backup.

Your either use a different backup method (pg_dump?), or you are not using archive_command.

With pg_dump you cannot do archive recovery, so you wouldn't need to archive WALs at all. If you are using a different archive method like pg_receivewal, you won't get the *.backup files and you have to think of a different method to remove your old WAL archives.

One simple method to purge your old WAL archives is to simply remove all those that are older than your retention time.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • I'm using streaming replication and pg_basebackup was performed when the slave started up for the first time. So, according to you words, wal backup files are generated only the very first time pg_basebackup is performed, and no more backup files are being generated as part of the streaming replication process. I realize that I misandertood this point: I was thinking that wal backup files keep generating as long as the slave is replicating the master. Thanks for the feedback. – sergi_bcn Sep 10 '19 at 23:20
1

The files are still being generated and archived (unless you turned that off) an the master. They are also passed by streaming to the replica where they are kept in pg_wal, but the replica automatically cleans them up every restartpoint. You can get the replica to keep them permanently by setting archive_mode=always on the replica, but it sounds like you don't want that.

If the only purpose of the archive (by the master) is to save files for use by the replica in case it falls to far behind for streaming (not for disaster recovery or PITR) than you can use "pg_archivecleanup" to automatically clean them up. This is invoked on the replica (not the master) but it must have write access to the archive directory. So you can mount it as a network file share, you can wrap pg_archivecleanup in ssh so it gets run on the master rather than the replica.

jjanes
  • 37,812
  • 5
  • 27
  • 34