I'm looking for a filesystem that can replicate over long distances, and tolerate being offline for extended periods of time by using a local buffer, which should be a disk buffer, to queue up changes to replicate.

DRBD with DRBD Proxy looked like an ideal candidate, but DRBD Proxy buffers in RAM, and I'm not sure that will be adequate.

I'm trying to avoid things like Ceph which have much more functionality than needed.

It should handle on the order of a billion files on a single filesystem, and it only needs to replicate one way, from filesystem A to filesystem B. There will be a lot of files, but they are write-once and never modified afterwards. A moderate amount of data is written continuously, but little enough that replication can feasibly catch up even after a few days offline. No clustering or anything fancy like that is required.

Really, what I'm looking for is something that works like MySQL replication, but for a file system.

I found a lot of commentary on replicating file systems, but for me the missing piece is being able to buffer updates to disk if the link is down for an extended period.

Gerber
  • Do you need a file system with built-in replication, or a file system that can cope with a billion files plus software to do the replication? – Tim Apr 12 '17 at 20:54
  • Well, I don't want to use something like rsync on a billion files, and I can't build replication into the application level. So I was thinking that replication would have to be at the file system level. If there's another option, I'm interested. – Gerber Apr 12 '17 at 20:58
  • You do know that DRBD tracks its changes in a bitmap, so it's not like you're storing all the files that have changed in RAM. Once your node reconnects (could be years later), both sides exchange the bitmap and sync the changes (a toy sketch of the bitmap idea follows the comments). – Matt Kereczman Apr 12 '17 at 21:26
  • Write once, never change, asynchronous replication. A solution could be built from a message queue holding information about new or changed files, plus a transport system (a rough sketch of this also follows the comments). Or you could use S3: it's highly scalable, highly durable, and mostly available. Something that meets your needs probably already exists, though. – Tim Apr 12 '17 at 21:44
  • @Matt Kereczman, that is probably the most useful piece of information. So DRBD Proxy on a box with 256 GB of RAM could probably queue up a really massive amount of changes. Thanks! – Gerber Apr 13 '17 at 16:52
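
Since the bitmap comment turned out to be the decisive one, here is a toy illustration of the idea (this is emphatically not DRBD's code; the class and the 4 KiB-per-bit granularity are assumptions for the arithmetic): one bit per fixed-size block means the tracking cost depends on device size, not on how long the peer stays offline.

```python
# Toy sketch of bitmap-based change tracking, as DRBD does conceptually.
# Assumption: 4 KiB tracked per bit; DRBD's real on-disk format differs.
BLOCK_SIZE = 4096

class DirtyBitmap:
    def __init__(self, device_size: int):
        self.nblocks = (device_size + BLOCK_SIZE - 1) // BLOCK_SIZE
        self.bits = bytearray((self.nblocks + 7) // 8)  # one bit per block

    def mark_dirty(self, offset: int, length: int) -> None:
        """Record that a write touched [offset, offset + length)."""
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        for b in range(first, last + 1):
            self.bits[b // 8] |= 1 << (b % 8)

    def dirty_blocks(self):
        """Block numbers to resend once the peer reconnects."""
        return [b for b in range(self.nblocks)
                if self.bits[b // 8] & (1 << (b % 8))]

# A 1 TiB device at 4 KiB per bit needs only a 32 MiB bitmap,
# no matter how many writes pile up while the link is down:
print(len(DirtyBitmap(1 << 40).bits) // 2**20, "MiB")  # -> 32 MiB
```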
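
And a rough sketch of the queue-plus-transport suggestion, assuming SQLite as the durable on-disk queue and rsync over ssh as the transport; the DB, SRC_ROOT, and REMOTE names and the schema are all illustrative, not anything the comments prescribe:

```python
import sqlite3
import subprocess
import time

DB = "replication-queue.db"    # the durable on-disk buffer (assumed name)
SRC_ROOT = "/data"             # hypothetical source tree
REMOTE = "backup-host:/data"   # hypothetical rsync destination

def open_queue() -> sqlite3.Connection:
    con = sqlite3.connect(DB)
    con.execute("CREATE TABLE IF NOT EXISTS queue ("
                "id INTEGER PRIMARY KEY, path TEXT NOT NULL)")
    return con

def enqueue(con: sqlite3.Connection, relpath: str) -> None:
    """Called by whatever notices new files (inotify, the writing app, ...)."""
    with con:  # commits on success, so queued paths survive a crash
        con.execute("INSERT INTO queue (path) VALUES (?)", (relpath,))

def ship(con: sqlite3.Connection) -> None:
    """Drain the queue in order; on transport failure, stop and retry later."""
    rows = con.execute("SELECT id, path FROM queue ORDER BY id").fetchall()
    for rowid, relpath in rows:
        # rsync's /./ anchor recreates the path relative to SRC_ROOT remotely.
        rc = subprocess.run(
            ["rsync", "-aR", f"{SRC_ROOT}/./{relpath}", REMOTE]).returncode
        if rc != 0:
            return  # link is down; the backlog simply grows on disk
        with con:
            con.execute("DELETE FROM queue WHERE id = ?", (rowid,))

if __name__ == "__main__":
    con = open_queue()
    while True:
        ship(con)
        time.sleep(60)  # retry/poll interval
```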

2 Answers

3

There is a fully asynchronous kernel-level replication system based on transaction logfiles: https://github.com/schoebel/mars

  • Welcome to Server Fault! I guess this is your own project; you must explicitly state so, as explained in the [Help Center](/help/promotion). – Glorfindel Jul 02 '19 at 19:55
0

Perhaps zfs send/receive would do the trick?

I have been using ZFS on Linux for years now to achieve something like this.

I can imagine a loop that creates a snapshot and then sends it over the wire; if the send fails, it retries with ever-growing delays between attempts.

You could even separate the snapshotting process from the replication process, keeping the increments small to improve resilience against network failures while sending the updates. A minimal sketch of such a loop follows.
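
To make that concrete, here is a minimal sketch of such a loop (not the script I actually run): it assumes a source dataset tank/data, a destination tank/replica on backup-host reachable over passwordless ssh, and it ignores restart bookkeeping and snapshot pruning.

```python
import subprocess
import time

DATASET = "tank/data"            # assumed source dataset
REMOTE = "backup-host"           # assumed replication target (ssh)
REMOTE_DATASET = "tank/replica"  # assumed destination dataset

def snapshot(name: str) -> None:
    subprocess.run(["zfs", "snapshot", f"{DATASET}@{name}"], check=True)

def pipe_to_remote(send_args) -> bool:
    """Run 'zfs send ...' locally and pipe it into 'zfs receive' remotely."""
    send = subprocess.Popen(["zfs", "send", *send_args],
                            stdout=subprocess.PIPE)
    recv = subprocess.run(
        ["ssh", REMOTE, "zfs", "receive", "-F", REMOTE_DATASET],
        stdin=send.stdout)
    send.stdout.close()
    return send.wait() == 0 and recv.returncode == 0

def send_with_retry(send_args) -> None:
    """Snapshots wait on disk while the link is down; retry with back-off."""
    delay = 60
    while not pipe_to_remote(send_args):
        time.sleep(delay)
        delay = min(delay * 2, 3600)  # cap the back-off at one hour

def main() -> None:
    prev = "repl-0"
    snapshot(prev)
    send_with_retry([f"{DATASET}@{prev}"])  # one-time full seed of the remote
    n = 0
    while True:
        n += 1
        cur = f"repl-{n}"
        snapshot(cur)
        send_with_retry(["-i", f"{DATASET}@{prev}", f"{DATASET}@{cur}"])
        # snapshots older than 'cur' could be destroyed here once received
        prev = cur
        time.sleep(300)  # snapshot interval; smaller means smaller increments

if __name__ == "__main__":
    main()
```

Separating the two processes as suggested above would just mean running snapshot() on its own timer and having the shipping loop walk the list of not-yet-sent snapshots.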

  • ZFS replication looks interesting. [Article One](https://arstechnica.com/information-technology/2015/12/rsync-net-zfs-replication-to-the-cloud-is-finally-here-and-its-fast/) [Article 2](http://www.bolthole.com/solaris/zrep/) – Tim Apr 12 '17 at 23:30