4

Bazaar limits the file size that it can commit based on the available virtual memory (according to an open bug).

I would like to put a database (as a mysqldump text file) under version control. The database is 3 GB and I am working on a server with 64 GB of memory, so I do not understand why this would be a problem. When I attempt to commit, I get the error reported in the bug:

bzr: ERROR: exceptions.OverflowError: requested number of bytes is more than a Python string can hold

Is there a way that I can get this file under bazaar version control?

I prefer Bazaar because I am familiar with it, but I plan to automate the dump and check-in as a cron job, so any suitable version control system will do.


I have come up with two options so far, until a better solution appears. Currently, I keep a copy of each weekly dump as a backup; storage is not an issue at this point. Alternatively, I could keep the first dump, diff each new dump against it, and put that diff under version control. This would keep a record of the changes, but it would not be possible to return to an earlier state. I am not comfortable with this unless there is a straightforward way to revert.

mysqldump mydb > mydb_base.sql
touch mydb_diff
bzr add mydb_diff
bzr commit -m 'first commit'

then in the cron script

mysqldump mydb > mydb.sql
diff mydb_base.sql mydb.sql > mydb_diff
bzr commit -m "`date +%Y.%m.%d-%H.%M` mydb diff" mydb_diff
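
On the revert concern: since every mydb_diff in the script above is taken against the same mydb_base.sql, an earlier state may be recoverable by applying a saved diff to the base dump. A minimal sketch, assuming a patch utility is installed; the function name restore_dump, the revision number 12, and the file names old.diff and mydb_restored.sql are hypothetical:

```shell
# Rebuild a full dump from the base dump plus one saved diff.
# Usage: restore_dump BASE DIFF OUTPUT
# Example (hypothetical revision number and file names):
#   bzr cat -r 12 mydb_diff > old.diff
#   restore_dump mydb_base.sql old.diff mydb_restored.sql
restore_dump() {
    base=$1
    diff_file=$2
    output=$3
    if [ ! -s "$diff_file" ]; then
        # An empty diff means the dump was identical to the base.
        cp "$base" "$output"
    else
        # -o writes the patched result to OUTPUT and leaves BASE untouched.
        patch -o "$output" "$base" "$diff_file" > /dev/null
    fi
}
```

This only works as long as mydb_base.sql itself is never replaced or modified.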
David LeBauer

4 Answers

2

Although I love Bazaar, I would not recommend using it for this purpose unless you definitely need to keep every version of your data indefinitely. If instead you only need a fixed number of past versions (e.g. 10), or only need them for a fixed period (e.g. 2 years), then I'd suggest an incremental backup tool like rdiff-backup.

AmanicA
  • thanks. that sounds like a better tool for the job, I think keeping backups from a series like 1 day, 1 week, 1 month, 1 year, would work, e.g., I can get rid of old versions as I go along. – David LeBauer Nov 07 '11 at 01:03
  • It looks like that feature is not available yet :( http://wiki.rdiff-backup.org/wiki/index.php/DeleteIntermediate – AmanicA Nov 07 '11 at 16:12
1

This is a bug in bzr, fixed in recent versions. See https://bugs.launchpad.net/bzr/+bug/683191

jelmer
0

If "Foreign VCS" suggestions are applicable, I have to say

Lazy Badger
  • My understanding is that largefiles stores the full version of each file. Also, does Mercurial 2.0 actually support files larger than 2 GB in its dirstate file yet? – jelmer Nov 06 '11 at 18:13
  • stores the full version of each file, yes, but only in one location and not inside repo - "Files added as largefiles are not tracked directly by Mercurial; rather, their revisions are identified by a checksum, and Mercurial tracks these checksums... A largefiles store, typically on a centralized server, has every past revision of every largefile. The local repository has a largefile store in '.hg/largefiles' which holds a subset of the largefiles needed." Yes, current limit is 2G - "Unfortunately, the use of dirstate limits largefiles to 2 GB. " – Lazy Badger Nov 06 '11 at 19:00
-2

First, you want to *back up things* with Bazaar rather than use a version control system. That's different! Bazaar stores changesets (differences between files) and keeps track of them and their merges.
Even though I understand your point, I think you are not using the right tool.

Anyway, to answer your question, a simple workaround is to reduce the file size. As a mysqldump is a text file, its size can be dramatically reduced by compressing it.

So a possible workaround is to compress it with 7zip (or whatever you want) before sending it to bazaar.

This prevents you from doing a real merge or diff in Bazaar, but I understand that this is not your priority.
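
A minimal sketch of the compression step, assuming gzip instead of 7zip; the function name compress_dump and the file name mydb.sql are hypothetical:

```shell
# Compress a dump to SRC.gz, keeping the original file in place.
# Example (hypothetical file name):
#   compress_dump mydb.sql
#   bzr add mydb.sql.gz
#   bzr commit -m "compressed dump" mydb.sql.gz
compress_dump() {
    src=$1
    # -c writes to stdout so the uncompressed dump is not deleted;
    # -n leaves the original name and timestamp out of the gzip header.
    gzip -n -c "$src" > "$src.gz"
}
```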

TridenT
  • 2
    Compressing it usually isn't a good idea, since that will actually make it harder for bzr to generate a delta against the previous version of the file. Bazaar will store the files in a compressed manner anyway. – jelmer Nov 06 '11 at 18:10
  • I know that it is not a good idea, as I mentioned, but it is better than not working at all! – TridenT Nov 06 '11 at 18:44
  • Fair enough, it's a workaround for bug 683191 (though that's also fixed now). It'll lead to more disk usage though. – jelmer Nov 06 '11 at 19:10