
fsync/fdatasync calls are expensive, but they are essential in databases because they provide the durability guarantee in ACID. In my tests, when there is only one process writing and calling fsync periodically, each fsync takes about 50 ms. But when there are multiple processes doing the same thing, say two of them, fsync sometimes (maybe 50% of the time) takes a significant amount of time: hundreds or thousands of milliseconds, even tens of seconds, and the system effectively becomes unusable. I would like to know how this problem is solved in databases, especially in distributed time-series databases, when multiple nodes share a single HDD.
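
Roughly, each process runs a loop along these lines (a simplified sketch; the file path, block size, and interval here are placeholders rather than my exact test setup):

```c
/* Simplified sketch of the test loop: repeatedly append a block
 * and fsync, timing each fsync call. Path and sizes are placeholders. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    char buf[64 * 1024];
    memset(buf, 'x', sizeof(buf));

    int fd = open("/data/fsync_test.dat", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) { perror("open"); return 1; }

    for (int i = 0; i < 100; i++) {
        if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
            perror("write");
            break;
        }

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        if (fsync(fd) != 0) { perror("fsync"); break; }   /* durability point */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ms = (t1.tv_sec - t0.tv_sec) * 1000.0 +
                    (t1.tv_nsec - t0.tv_nsec) / 1e6;
        printf("fsync %3d: %.1f ms\n", i, ms);

        usleep(100 * 1000);   /* "periodic" fsync: roughly one every 100 ms */
    }

    close(fd);
    return 0;
}
```

Running one copy of this gives the ~50 ms fsync times; running two copies against the same HDD is what produces the multi-second stalls described above.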

Ju Piece
  • _Distributed_ databases **never** run on a single HDD, except if you create a network share from your desktop, have that share mounted on all the server nodes, and tell the databases to use that share. So, sorry, but that doesn't make sense. – Ancoron May 27 '19 at 23:46
  • Sorry for not clarifying: here I mean a pseudo-distributed cluster, where all the nodes run on the same machine. – Ju Piece May 28 '19 at 02:07
  • Why would you do that? But anyway, they'll compete for the resources. It's like trying to drive with multiple cars next to each other in a single lane: the lane is not going to get any bigger, and we all know how traffic jams begin. An SSD can help a lot here, as it can handle more I/O operations in parallel. But if that speeds up a system running multiple instances vs. a single one on the same node, it only means that either you need to tune the database or it simply can't take advantage of the storage's capabilities. – Ancoron May 28 '19 at 08:10
  • But back to your problem: databases cannot "solve" issues in lower layers for which they are not responsible. If I/O requests hang in the operating system call or in the hardware, there is nothing a database can do. The bad news is also that most I/O operations still block the caller, as they also have to appear atomic. Today, an "fsync" does not necessarily imply that the bytes are physically on the platter when the call returns. – Ancoron May 28 '19 at 08:18
  • In addition, databases tend to collect I/O writes for a certain time to reorganize them and reduce the total number of I/O operations they have to send to the file system (see the sketch after these comments). Modern file systems also do the same thing on their side (e.g. see the "commit" mount option for ext4). If two or more DB instances now do a lot of writes around the same time, that can also very easily "overwhelm" the disk with write operations, especially if there are reads in the waiting queue as well. But here we're back at the traffic-jam problem. – Ancoron May 28 '19 at 08:48
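
A minimal single-process sketch of that batching ("group commit") idea, with a placeholder path and batch size, and without assuming any particular database's implementation:

```c
/* Illustrative sketch only: buffer several appends in the page cache,
 * then pay the fsync cost once for the whole batch. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BATCH 32            /* records accumulated before one fsync */
#define RECORD_SIZE 4096

int main(void) {
    char record[RECORD_SIZE];
    memset(record, 'x', sizeof(record));

    int fd = open("/data/wal.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) { perror("open"); return 1; }

    for (int commit = 0; commit < 10; commit++) {
        /* Append a whole batch of records; these only hit the page cache. */
        for (int i = 0; i < BATCH; i++) {
            if (write(fd, record, sizeof(record)) != (ssize_t)sizeof(record)) {
                perror("write");
                close(fd);
                return 1;
            }
        }
        /* One fsync makes every record in the batch durable at once. */
        if (fsync(fd) != 0) { perror("fsync"); close(fd); return 1; }
    }

    close(fd);
    return 0;
}
```

Each fsync still costs the same, but its cost is amortized over BATCH records instead of being paid per write, which is why fewer, larger syncs put less pressure on a shared HDD.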

0 Answers