
I am currently using pg_dump piped to gzip piped to split. The problem with this approach is that all of the output files change on every dump, so a checksum-based backup always copies all of the data.
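The effect can be reproduced without a database; here a generated text file stands in for the pg_dump output (file and chunk names are arbitrary):

```shell
# A one-line change near the start of the "dump" shifts gzip's entire
# output stream from that point on, so the split chunks all differ and
# a checksum-based sync re-copies everything.
seq 1 200000 > dump.sql
gzip -c dump.sql | split -b 64k - chunk_a_
sed 's/^1$/X/' dump.sql | gzip -c | split -b 64k - chunk_b_

# Compare per-chunk checksums of the two runs:
md5sum chunk_a_* chunk_b_*
```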

Are there any other good ways to perform an incremental backup of a PostgreSQL database, where a full database can be restored from the backup data?

For instance, it would help if pg_dump could order its output deterministically, so that any changes end up only at the end of the dump, or something similar.

Asclepius
Dennis Thrysøe
  • Have you found a solution? I have the same requirement, i.e. incremental backup in PostgreSQL. I have gone through many articles and websites but could not find a clear-cut way to do incremental backups in PostgreSQL. Does PostgreSQL support incremental backup natively, without third-party tools like pg_rman? Please help me with this. Thanks. – Suniel Jul 03 '16 at 17:13

3 Answers


Update: Check out Barman for an easier way to set up WAL archiving for backup.

You can use PostgreSQL's continuous WAL archiving method. First you need to set wal_level = archive (the setting is called replica in 9.6 and later), then take a full filesystem-level backup (between issuing the pg_start_backup() and pg_stop_backup() commands), and then just keep copying over newer WAL files by configuring the archive_command option.
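A minimal sketch of the pieces involved; the archive directory and backup label are placeholders, and the exact archive_command should be adapted to your environment:

```shell
# postgresql.conf — continuous archiving (archive directory is a placeholder)
# wal_level = archive        # named "replica" in 9.6 and later
# archive_mode = on
# archive_command = 'test ! -f /mnt/wal_archive/%f && cp %p /mnt/wal_archive/%f'

# Full base backup: run the filesystem copy between the two calls
psql -c "SELECT pg_start_backup('weekly', true);"
rsync -a "$PGDATA"/ /mnt/base_backup/
psql -c "SELECT pg_stop_backup();"
```

After this, the incremental part is automatic: the server keeps invoking archive_command for each completed WAL segment.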

Advantages:

  • Incremental, the WAL archives include everything necessary to restore the current state of the database
  • Almost no overhead, copying WAL files is cheap
  • You can restore the database at any point in time (this feature is called PITR, or point-in-time recovery)

Disadvantages:

  • More complicated to set up than pg_dump
  • The full backup will be much larger than a pg_dump because all internal table structures and indexes are included
  • Does not work well for write-heavy databases, since recovery will take a long time.

There are tools such as pitrtools and OmniPITR that can simplify setting up and restoring these configurations, but I haven't used them myself.
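For the restore side, a recovery.conf pointing back at the archive is enough on PostgreSQL versions before 12 (paths and the target timestamp below are placeholders):

```shell
# After restoring the filesystem-level base backup into $PGDATA:
cat > "$PGDATA/recovery.conf" <<'EOF'
restore_command = 'cp /mnt/wal_archive/%f "%p"'
# Optional: stop replay at a point in time instead of the latest state
# recovery_target_time = '2016-07-03 12:00:00'
EOF
# Starting the server now replays the archived WAL up to the target.
```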

Tsutomu
intgr
  • Just playing with this myself now, and finding that even on Windows (although most docs are Linux-heavy) this is not too bad without the tools – Daniel Casserly Apr 03 '12 at 21:31
  • Since 9.1 there is pg_basebackup – www.postgresql.org/docs/9.4/static/app-pgbasebackup.html. It works with servers of the same or an older major version, down to 9.1. However, WAL streaming mode (-X stream) only works with server version 9.3 and later. – Ross Jan 04 '16 at 23:46

Also check out http://www.pgbackrest.org

pgBackRest is another backup tool for PostgreSQL which you should evaluate, as it supports:

  • Parallel backup (tested to scale almost linearly up to 32 cores, but can probably go much further)
  • Compressed-at-rest backups
  • Incremental and differential (compressed!) backups
  • Streaming compression (data is compressed only once at the source, then transferred across the network and stored)
  • Parallel, delta restore (the ability to update an older copy to the latest)
  • Full tablespace support
  • Backup rotation and archive expiration
  • The ability to resume backups which failed for some reason
  • etc., etc.
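As a rough sketch of what usage looks like (the stanza name, paths, and option names below are illustrative and vary between pgBackRest versions):

```shell
# /etc/pgbackrest.conf (illustrative; older releases used db-path/repo-path):
#   [global]
#   repo1-path=/var/lib/pgbackrest
#   [main]
#   pg1-path=/var/lib/postgresql/9.6/main

pgbackrest --stanza=main stanza-create
pgbackrest --stanza=main --type=full backup   # first, a full backup
pgbackrest --stanza=main --type=incr backup   # later, incrementals
pgbackrest --stanza=main restore              # delta restore into the data dir
```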
vishesh
Stephen Frost
  • How do I back up from a standby? I have /etc/pgbackrest.conf with backup-standby=y, archive-async=y, and a remotebackup stanza containing db-port=5433 and db-host set to the LAN IP (192.168.0.3). Every time I invoke /opt/perlroot/bin/perl /opt/perlroot/bin/pgbackrest --stanza=remotebackup stanza-create I get error 124, remote process terminated on 192.168.0.3, exit status 255: permission denied (public key). But strangely, invoking sudo -u postgres ssh 192.168.0.3 from the standby works fine. Adding cmd-ssh=/usr/bin/sudo -u postgres /usr/bin/ssh -vv didn't work either :( – user3453753 Jan 28 '17 at 06:22
  • Backing up from standby is covered in the pgbackrest documentation and mainly just involves setting up the replicas in the pgbackrest.conf file. As for your permission denied issue, I can't really speak to it but you likely have a misconfiguration on your system regarding the SSH keys. – Stephen Frost Dec 19 '17 at 19:02

Another method is to back up to plain text and use rdiff to create incremental diffs.
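A sketch of how that works with rdiff (from librsync); the database and file names are placeholders, and this assumes you keep the previous full plain-text dump around:

```shell
pg_dump mydb > dump_new.sql

# Compute a delta of the new dump against the previous full dump:
rdiff signature dump_old.sql dump_old.sig
rdiff delta dump_old.sig dump_new.sql dump.delta   # small if little changed

# Later, reconstruct the new dump from the old dump plus the delta:
rdiff patch dump_old.sql dump.delta dump_restored.sql
```

Only the (typically small) delta files need to be copied off-site between full dumps.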

Scott Marlowe
  • Can't imagine doing that for a 5G database, let alone 50G ones. – Jerry May 09 '12 at 14:00
  • I've used it on far bigger DBs than 50G in the past, using snapshots of the data dir itself. But yeah, once a plain text backup starts getting big and unwieldy it's a good idea to use some other method. – Scott Marlowe May 10 '12 at 02:24
  • I don't know how barman works exactly, but WAL archiving does not save indexes as far as I know (http://www.postgresql.org/docs/9.2/static/continuous-archiving.html#CONTINUOUS-ARCHIVING-CAVEATS). – Bladt Sep 12 '14 at 20:45
  • @ScottMarlowe, for what it's worth, have you tried a binary diff on the custom dump output format? – Asclepius Jun 18 '15 at 23:47
  • Surely rsync is better. Diff backups are not required here, just the ability to back up only what has changed, a la rsync. – Ross Jan 04 '16 at 23:49
  • @Bladt WAL archiving does not save HASH indexes. The default index type is a B-tree. Hash indexes are a special type. – Doug Nov 29 '16 at 14:50
  • @Doug Nice catch! – Bladt Nov 30 '16 at 14:47
  • FYI: As of PG 10, hash indexes are WAL-logged. – DylanYoung Mar 17 '20 at 14:23