
I would like to use an application API that is not "crash safe"; in other words, there is a high likelihood of the data file being corrupt and unreadable if the application crashes.

The file itself is a "metadata file" and should not get very big: a few hundred MB at most.

What I want to do is:

  1. Force the application to access the file in "direct mode" (no OS caching).
  2. Pause updates at regular "checkpoint" intervals.
  3. Perform a flush() (some data probably got flushed automatically)
  4. Now that I know the file is consistent, clone it.
  5. If there is an "old clone" delete it.
  6. Resume making changes to the original file.
  7. Loop.

Could I use a special-purpose file system that makes some kind of "zero copy" of the file, combined with copy-on-write of the modified sectors of the original file, to get the clone "almost free" (with minimum disk IO)?

Also, can I do the "clone" without having to fork a process? (I don't know if the Linux file API offers a "cp" system-call).

monster
  • You could use LVM snapshotting for this instead of cloning. If something goes wrong, just copy the file from the clone. – AndreasM Feb 13 '12 at 12:08
  • Create an LVM volume for this file only, so that the performance penalty of the LVM snapshot doesn't affect other files. I would say that BTRFS is not ready for production now. – kupson Feb 13 '12 at 12:45
  • @AndreasM You should say it as an "answer". This sounds like a good idea, but I can't "accept" a comment (though I will wait a few hours to see if anything else comes up). – monster Feb 13 '12 at 12:58

3 Answers


You could use LVM snapshotting for this instead of cloning. If something goes wrong, just copy the file from the clone.

There is a libdevmapper/libdevmapper-event-lvm2snapshot which could be helpful in doing this programmatically (without a fork): http://sourceware.org/dm/

Edit:

If you can change your program, here is another solution: https://stackoverflow.com/questions/1565177/can-i-do-a-copy-on-write-memcpy-in-linux

mmap() the file twice, once normally and once with MAP_PRIVATE.

This would avoid the side effects (especially the performance cost) of LVM.
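To illustrate the double-mmap() idea, here is a minimal Python sketch, assuming Linux. The function names and the scratch file are hypothetical and only there for the demonstration. It shows the copy-on-write semantics and nothing more: a write through the MAP_PRIVATE mapping stays in memory and never reaches the file, while the MAP_SHARED mapping keeps reflecting the file. Keep in mind that a private mapping is not a guaranteed frozen snapshot: whether later changes to the file become visible in private pages that have not been written to yet is unspecified.

```python
import mmap
import os
import tempfile


def make_scratch_file():
    """Create a one-page scratch file filled with b'A' (hypothetical test data)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"A" * mmap.PAGESIZE)
    return path


def demo_private_mapping(path):
    """Write through a MAP_PRIVATE mapping and report what each view sees.

    Returns (private_view, shared_view, on_disk) for the first four bytes.
    """
    with open(path, "r+b") as f:
        size = os.fstat(f.fileno()).st_size
        prot = mmap.PROT_READ | mmap.PROT_WRITE
        shared = mmap.mmap(f.fileno(), size, flags=mmap.MAP_SHARED, prot=prot)
        private = mmap.mmap(f.fileno(), size, flags=mmap.MAP_PRIVATE, prot=prot)
        try:
            # This write triggers copy-on-write: the kernel gives the private
            # mapping its own copy of the page and leaves the file untouched.
            private[0:4] = b"WORK"
            private_view = bytes(private[0:4])
            shared_view = bytes(shared[0:4])
        finally:
            private.close()
            shared.close()
    # Re-read the file to confirm the private write never reached it.
    with open(path, "rb") as f:
        on_disk = f.read(4)
    return private_view, shared_view, on_disk
```

Calling demo_private_mapping(make_scratch_file()) returns the modified bytes for the private view and the original bytes for both the shared view and the file on disk.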

AndreasM
  • According to this blog: http://johnleach.co.uk/words/613/lvm-snapshot-performance LVM snapshots can be very slow, which might not be a problem in my case, but can also *slow down non-snapshotted volumes*, which sounds worrying. Since I will permanently have a (small) snapshot, I'll have to do some benchmarks. – monster Feb 13 '12 at 14:35

Here's a quick suggestion that doesn't involve LVM. Use R1Soft Hot Copy to take one or more point-in-time snapshots of the filesystem in question. See the tips page. It uses copy-on-write technology. This has been a solution to some similar questions here, and it also applies to what you're looking to do.

ewwhite
  • For my purpose, it seems equivalent to LVM, but the "Outstanding performance compared to LVM snapshots" promise does sound attractive, and it seems to be also free for my use-case. Thanks. – monster Feb 13 '12 at 14:21
  • It's free and doesn't require formatting your partitions as LVM. That's the main thing for me, as I tend not to use LVM in my deployments. Again, the copy-on-write functionality is also consistent with what you were looking for. – ewwhite Feb 13 '12 at 15:26
  • Sounds good, probably the best solution for your use case. Is it open source? – AndreasM Feb 13 '12 at 17:41
  • Free, but closed-source. – ewwhite Feb 13 '12 at 17:44
  • Btrfs: cp --reflink or snapshots
  • NILFS: by design, as far as I understand
  • ZFS on Linux (some people say it works fine for them): snapshots
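On the "without forking" part of the question: a reflink like the one cp --reflink makes can be requested in-process through the FICLONE ioctl on filesystems that support it. Below is a minimal Python sketch of one checkpoint step, under several assumptions: the function name clone_checkpoint and the ".tmp" suffix are hypothetical, the application has already flushed its own data before the call, and the numeric FICLONE value is the x86/ARM encoding (other architectures may differ). When the filesystem refuses the ioctl, the sketch falls back to an ordinary byte copy, so it does not guarantee a zero-copy clone.

```python
import fcntl
import os
import shutil

# Assumption: FICLONE is _IOW(0x94, 9, int), which is 0x40049409 on x86 and
# ARM. Newer Python versions expose fcntl.FICLONE directly.
FICLONE = getattr(fcntl, "FICLONE", 0x40049409)


def clone_checkpoint(src_path, clone_path):
    """Clone src_path to clone_path, replacing any old clone.

    Returns True if a reflink was made, False if a plain copy was used.
    The caller is assumed to have paused updates and flushed src_path.
    """
    tmp_path = clone_path + ".tmp"  # hypothetical temporary name
    reflinked = True
    with open(src_path, "rb") as src, open(tmp_path, "wb") as dst:
        try:
            # Ask the filesystem to share src's extents with dst.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
        except OSError:
            # No reflink support here (or a different device): copy the bytes.
            reflinked = False
            shutil.copyfileobj(src, dst)
        dst.flush()
        os.fsync(dst.fileno())
    # Rename over the old clone so a complete clone exists at all times.
    os.replace(tmp_path, clone_path)
    return reflinked
```

Renaming the temporary file over the previous clone covers the "delete the old clone" step without a window in which no valid clone exists.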
poige