
I want to back up 100 TB of data. Given my computation power and bandwidth, the backup will take about 30 days. But the data are not static: while the backup is running, files within the source directory will be modified, created, and deleted.

The question is: will duply/duplicity back up the state of a file as it is when the file is read, or as it was when the backup was initiated?

tash

1 Answer


duplicity will use the file state at the point in time when the file is processed during the backup.

Note: as a user-space application, duplicity cannot enforce file system consistency. If a file is readable but currently open in another application and only partially written, that inconsistent state will be backed up.

Suggestions

  1. use a file system that is snapshot capable and back up the snapshots (a shell sketch follows this list)
  2. stop services/software that might write to the data being backed up, so you capture a consistent state beforehand
  3. duplicity was never developed for data sets this huge; you may run into trouble
  4. for big data sets, backing up to a local file system first and mirroring that to a cloud location later can improve performance a lot
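For suggestions 1 and 4, here is a minimal shell sketch of the snapshot-then-backup-then-mirror flow, assuming a ZFS source dataset, a local duplicity target, and an already-configured rclone remote (dataset, paths, and remote names below are hypothetical):

    # Hypothetical names -- adjust to your environment.
    SRC_DATASET="tank/data"                     # ZFS dataset holding the source data
    SNAP_NAME="backup-$(date +%Y%m%d)"
    LOCAL_TARGET="file:///mnt/backup/duplicity" # local duplicity target (suggestion 4)

    # 1. Take a snapshot so duplicity reads a frozen, consistent view of the data.
    zfs snapshot "${SRC_DATASET}@${SNAP_NAME}"

    # 2. Back up from the snapshot path instead of the live file system.
    duplicity full "/${SRC_DATASET}/.zfs/snapshot/${SNAP_NAME}" "${LOCAL_TARGET}"

    # 3. Mirror the finished local backup set to cloud storage afterwards,
    #    e.g. with rclone to a remote configured beforehand.
    rclone sync /mnt/backup/duplicity remote:my-backup-bucket

    # 4. Drop the snapshot once it is no longer needed.
    zfs destroy "${SRC_DATASET}@${SNAP_NAME}"

Subsequent runs would take a fresh snapshot and run duplicity against it the same way; with an existing chain at the target, duplicity defaults to an incremental backup.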
ede-duply.net
  • What problems do you foresee with data sets that huge? – tash Jan 17 '23 at 18:28
  • duplicity just was never optimized for huge data sets. While being a long-time supporter, I have to admit that it has weaknesses: 1. manifests may grow to unsuitable sizes (they are currently not split like volumes) 2. bitrot at one point in a backup chain may make it difficult to restore later backups – ede-duply.net Jan 18 '23 at 23:13
  • Thanks, yes, I saw the manifest and signature files grow large. – tash Jan 18 '23 at 23:46