
What I'm doing

I'm using Borg backup v1.1.0b6 on my production AWS Linux server. Borg is meant to be block-based, deduplicating, and incremental. I've also tried borg-linux64 v1.1.6 with the same results.

My Observation

I'm finding that instead of producing incremental, deduplicated backups, it seems to do a full backup each time I run it: it creates a new file containing all the data from my source folders and deletes the old data files, rather than writing only the new data to a new file and keeping the existing files.

I would only expect significant change on the file system when I run a 'prune' operation, removing archives that fall outside my retention schedule.

The Key Problem

The key problem is I have to upload my entire set of data to offsite storage each night.
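
A quick way to quantify this (a sketch only; the offsite host and path are made-up placeholders for my real target) is a dry-run rsync of the repository after each backup, which itemizes everything that would have to be re-uploaded:

# 'offsite:/backup/test' is a hypothetical destination - substitute your own
rsync -a -n --itemize-changes /tmp/test/ offsite:/backup/test/

With the behaviour shown in the example below, every run lists the large, newly numbered segment file as a full new transfer.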

The Question

Am I using it incorrectly? Are my assumptions incorrect? How do I get borg to create a new file for new data each day, without copying all the old data into it?

Example

For example, here I've created a new backup repository

/usr/local/bin/borg init /tmp/test -e none

I do an initial run

/usr/local/bin/borg create --stats /tmp/test::1 /var/testfiles

Which creates these files in the repository (borg's segment files)

-rw------- 1 root root       17 Jun 27 20:09 1
-rw------- 1 root root       17 Jun 27 20:10 3
-rw------- 1 root root 23842026 Jun 27 20:10 4
-rw------- 1 root root       17 Jun 27 20:10 5

With this output

------------------------------------------------------------------------------
Archive name: 1
Time (start): Wed, 2018-06-27 20:10:45
Time (end):   Wed, 2018-06-27 20:10:46
Duration: 0.57 seconds
Number of files: 150
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               25.43 MB             24.10 MB             23.84 MB
All archives:               25.43 MB             24.10 MB             23.84 MB

                       Unique chunks         Total chunks
Chunk index:                     155                  160

I run the create command again with no changes to the data directory. Note that instead of putting just the new blocks into a new file, it has deleted file '4' and created a new one.

/usr/local/bin/borg create --stats /tmp/test::2 /var/testfiles

Backup folder

-rw------- 1 root root       17 Jun 27 20:09 1
-rw------- 1 root root       17 Jun 27 20:10 3
-rw------- 1 root root       17 Jun 27 20:10 5
-rw------- 1 root root       17 Jun 27 20:11 7
-rw------- 1 root root 23842579 Jun 27 20:11 8
-rw------- 1 root root       17 Jun 27 20:11 9

Output run #2

------------------------------------------------------------------------------
Archive name: 2
Time (start): Wed, 2018-06-27 20:11:14
Time (end):   Wed, 2018-06-27 20:11:14
Duration: 0.04 seconds
Number of files: 150
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               25.43 MB             24.10 MB                460 B
All archives:               50.86 MB             48.19 MB             23.84 MB

                       Unique chunks         Total chunks
Chunk index:                     156                  320
------------------------------------------------------------------------------

I then added a 1MB file to the source folder and ran the backup again. Again, the data file '8' has been deleted and a new file '12' created.

/usr/local/bin/borg create --stats /tmp/test::3 /var/testfiles

Backup folder

-rw------- 1 root root       17 Jun 27 20:09 1
-rw------- 1 root root       17 Jun 27 20:10 3
-rw------- 1 root root       17 Jun 27 20:10 5
-rw------- 1 root root       17 Jun 27 20:11 7
-rw------- 1 root root       17 Jun 27 20:11 9
-rw------- 1 root root       17 Jun 27 20:15 11
-rw------- 1 root root 24916076 Jun 27 20:15 12
-rw------- 1 root root       17 Jun 27 20:15 13

Output run #3

------------------------------------------------------------------------------
Archive name: 3
Time (start): Wed, 2018-06-27 20:15:34
Time (end):   Wed, 2018-06-27 20:15:34
Duration: 0.06 seconds
Number of files: 151
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               26.61 MB             25.16 MB              1.07 MB
All archives:               77.47 MB             73.35 MB             24.91 MB

                       Unique chunks         Total chunks
Chunk index:                     159                  481
------------------------------------------------------------------------------

What I expect is files that look more like this: backup one creates file '4' with 23MB of data, the second backup adds nothing, and the third backup adds around 1MB of extra data, which goes into a new file.

-rw------- 1 root root       17 Jun 27 20:09 1
-rw------- 1 root root       17 Jun 27 20:10 3
-rw------- 1 root root 23842026 Jun 27 20:10 4
-rw------- 1 root root       17 Jun 27 20:10 5
-rw------- 1 root root       17 Jun 27 20:11 7
-rw------- 1 root root       17 Jun 27 20:11 9
-rw------- 1 root root       17 Jun 27 20:15 11
-rw------- 1 root root 1000000 Jun 27 20:15 12
-rw------- 1 root root       17 Jun 27 20:15 13
  • Something to do with: http://borgbackup.readthedocs.io/en/stable/faq.html#it-always-chunks-all-my-files-even-unchanged-ones ? – Lenniey Jun 27 '18 at 15:00
  • Thanks Lenniey, but based on my read I don't think it's that. I suspect Borg just doesn't work the way I expect, so I might have to find another backup tool to reduce bandwidth. – Tim Jun 27 '18 at 19:17

1 Answer


What you see is the effect of borg compacting segments.

Because your test backup is small, the effect looks relatively big; with more data (try e.g. 10GB) it is relatively much smaller.
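
You can see why a small repo is hit so hard by checking the segment size limit in the repository config (in borg 1.1 the default max_segment_size is large, on the order of 500 MB, so your whole 24 MB test repo fits into a single segment file, and compacting that segment means rewriting essentially the entire repository):

grep max_segment_size /tmp/test/config

With 10 GB spread across many segments, only the segments that actually contain freed space get rewritten, so the rewritten fraction stays small.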

I am working on improving the compaction behaviour; you can follow the work here:

https://github.com/borgbackup/borg/pull/3925
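
Until that lands, one documented workaround, if it fits your retention needs, is append-only mode, which disables compaction entirely so segment files are only ever added, never rewritten or deleted. A minimal sketch, editing the repo config from your example:

# in /tmp/test/config, add this key to the existing [repository] section
[repository]
append_only = 1

The trade-off is that nothing is ever freed: deletes and prunes only record their intent, and the repository grows until you turn append-only off and let borg compact again.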

  • Thanks Thomas, that's really useful / interesting. It'd be good if it could consider bandwidth for backups. For now I'll probably switch to Restic backup, which stores data in 5MB blocks, so it shouldn't have this behavior. I do like Borg though; it seems like good technology and is well maintained, it's just not quite right for my specific use. – Tim Jul 08 '18 at 01:12