Here is an article on lwn.net that discusses at length the potential data loss that occurs when a program fails to sync its data adequately (i.e. the crash-consistency problem) on ext4 (the comments discussion is enlightening as well).
ext3 apparently achieves better crash consistency when using data=ordered, because it forces data to disk before the corresponding metadata changes are committed to the journal, and it uses a default commit interval of 5 seconds. ext4, in contrast, trades this off for performance: its delayed physical block allocation model lets uncommitted data continue living in the cache for some time. A quote from the article:
> The kernel doesn't like to let file data sit unwritten for too long, but it can still take a minute or so (with the default settings) for that data to be flushed - far longer than the five seconds normally seen with ext3
So unwritten data can exist only in a volatile cache until it is forced to disk by a system-wide sync OR by the application's explicit fsync of its own data (as Jeffery has pointed out). If the application/client doesn't do this, we are more prone to data loss.
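To make the distinction concrete, here is a small shell sketch (paths are hypothetical): the first write may sit in the page cache for a while, the second asks dd to fsync(2) the file before exiting, and a system-wide sync flushes whatever is still dirty.

```shell
# Plain write: the data may live only in the volatile page cache for a while
dd if=/dev/zero of=/tmp/unsynced.dat bs=4k count=16 2>/dev/null

# conv=fsync makes dd call fsync(2) on the output file before exiting,
# forcing the data to stable storage (the application-side fix)
dd if=/dev/zero of=/tmp/synced.dat bs=4k count=16 conv=fsync 2>/dev/null

# A system-wide sync flushes all remaining dirty data, including unsynced.dat
sync
```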
One way of mitigating this issue is to mount the required filesystem with the sync option (refer to the "ext4 and data loss" discussion thread), and to do so we would have to mandate it in two places:
1. The mount into the pod
2. The OpenEBS storage pool OR the backend store

(In case of 1, we could have the target convert all writes to sync writes, as explained by Jeffery)
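For illustration, mandating the option on the backend store could look like the following fstab entry (the device path and mount point are placeholders, not the actual OpenEBS layout):

```
# /etc/fstab entry forcing synchronous writes on the backing ext4 volume
# (device and mount point are hypothetical placeholders)
/dev/sdb1  /var/openebs/pool0  ext4  defaults,sync  0  2
```

The equivalent one-off command would be `mount -o sync /dev/sdb1 /var/openebs/pool0` (requires root).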
While the mount(8) documentation specifically states that -o sync is only supported up to ext3 (among the ext family of filesystems), a manual filesystem mount with this option is accepted on ext4. To check whether this is something the mount call accepts but ext4 ignores, I ran a small fio-based random write performance test on a 256M data sample, once on a disk mounted with the sync option and once on the same disk without it. To ensure that the writes themselves were not SYNC writes, the libaio ioengine was selected with direct=1 and iodepth=4 (asynchronous, unbuffered I/O).
The results showed a difference of around 300+ IOPS (with the non-sync mount performing better, of course). This result suggests that the sync mount flag does play a role on ext4, but I'm still looking for stronger evidence.
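For reference, a fio job file roughly matching the test described above (the block size and filename are assumptions, not stated parameters of my run):

```
; Approximate reproduction of the random write test above
[randwrite-256m]
ioengine=libaio                 ; asynchronous I/O
direct=1                        ; unbuffered: bypass the page cache on the app side
iodepth=4
rw=randwrite
bs=4k                           ; assumed block size
size=256m
filename=/mnt/testdisk/fio.dat  ; hypothetical path on the disk under test
```

Running the same job against the sync-mounted and the normally-mounted disk isolates the effect of the mount flag from the application's own write behavior.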