
I'm writing C code with some real-time constraints. I tested out the speed I can write to a disk with dd:

dd if=/dev/zero of=/dev/sdb bs=32K count=32768 oflag=direct

This writes 1 GB of zeros to /dev/sdb in 32K blocks.

I reach about 103 MB/s with this.

Now I programmatically do something similar:

open("/dev/sdb",O_WRONLY|O_CREAT|O_DIRECT|O_TRUNC, 0666);

I take a timestamp, write a 32K buffer to /dev/sdb 10,000 times in a for loop, take another timestamp, and do a bit of number crunching to get the rate in MB/s. It comes out to about 49 MB/s.

Why can't I reach the same speed as dd? An strace of dd shows the same open call that I use.
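For reference, the loop described above can be sketched roughly as follows. This is a minimal sketch and not the asker's actual code: the helper name `timed_write_mbps`, the 4096-byte alignment, and the `direct` switch are assumptions made for illustration. It shows two details worth ruling out when comparing against dd: on Linux, O_DIRECT generally expects the buffer address and length to be aligned to the device's block size, and dd reports MB/s with 1 MB = 10^6 bytes, so the rate calculation should use the same unit.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux/glibc */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0             /* assumption: fall back to buffered I/O elsewhere */
#endif

/* Hypothetical helper: writes `count` blocks of `block` bytes to `path` and
 * returns the rate in MB/s (1 MB = 10^6 bytes), or -1.0 on any error. */
double timed_write_mbps(const char *path, size_t block, long count, int direct)
{
    void *buf = NULL;

    /* O_DIRECT needs an aligned buffer; 4096 is an assumed safe alignment. */
    if (posix_memalign(&buf, 4096, block) != 0)
        return -1.0;
    memset(buf, 0, block);

    int fd = open(path, O_WRONLY | O_CREAT | (direct ? O_DIRECT : 0), 0666);
    if (fd < 0) {
        free(buf);
        return -1.0;
    }

    struct timespec t0, t1;
    long done = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (; done < count; done++) {
        if (write(fd, buf, block) != (ssize_t)block)
            break;             /* short write or error: stop timing */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    free(buf);
    if (done != count)
        return -1.0;

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return secs > 0.0 ? ((double)block * (double)count / 1e6) / secs : -1.0;
}
```

Calling `timed_write_mbps("/dev/sdb", 32768, 32768, 1)` would write the same 1 GB that the dd command writes (and overwrite whatever is on that disk), which keeps the two measurements comparable. The 10,000-block loop writes only about a third of that.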

dschatz

2 Answers


Check what system calls dd makes, not just the open but also the subsequent reads and writes. Using the right buffer sizes can make a significant difference in this kind of large copy. Note that /dev/zero is not a good test for benchmarking if your final goal is a disk-to-disk copy.

If you can't match dd's speed by matching it system call for system call... well, read the source.

Gilles 'SO- stop being evil'
  • I am actually interested in direct writing from memory to /dev/sdb so I feel /dev/zero should work pretty well. Also what are you talking about with regards to the reads and writes? I specify the block size in the command to be 32K. – dschatz Aug 13 '10 at 19:05
  • The source is 2000 lines. Not exactly a magnum opus. Just check it out: http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/dd.c. It seems it uses read, write, memcpy, memset. Nothing magic there. It does seem to have a few strategies for reading/writing, though some of those variations seem to be designed for special filesystem/OS requirements. – Merlyn Morgan-Graham Aug 13 '10 at 19:31
  • I have looked at it and I don't see any differences: `fd_reopen (STDOUT_FILENO, output_file, O_WRONLY | opts, perms)`, `nread = iread_fnc (STDIN_FILENO, ibuf, input_blocksize);`, `size_t nwritten = iwrite (STDOUT_FILENO, obuf, n_bytes_read);`. If anyone can tell me what it does that I'm not seeing then I would be grateful. – dschatz Aug 13 '10 at 19:36
  • You don't even need `dd`'s source to see the syscalls - that's what `strace` is for – qrdl Aug 13 '10 at 19:42
  • @dschatz: again, did you check that your code makes the exact same sequence of `read` and `write` calls (diff their `strace`s)? If they do, and if you're sure you've eliminated any caching effect in your benchmark, then you'll have to study the source harder. Try copying part of the code of `dd` and seeing if that helps. It might not be easy to find the clincher(s)! – Gilles 'SO- stop being evil' Aug 13 '10 at 19:56

I'm leaving the part about matching the system calls to somebody else. This answer is about the buffering part.

Try benchmarking the buffer size you use. Experiment with a range of values.

When learning Java, I wrote a simple clone of 'copy' and then tried to match its speed. Since the code did byte-by-byte reads and writes, the buffer size was what really made the difference. I wasn't buffering it myself, but I was asking the read to fetch chunks of a given size. The bigger the chunk, the faster it went, up to a point.

As for using a 32K block size, remember that the OS still uses separate IO buffers for user-mode processes. Even if you are doing something with specific hardware, e.g. writing a driver for a device that has some physical limitation, such as a CD-RW drive with fixed sector sizes, the block size is only part of the story. The OS will still have its buffer too.

Kelly S. French
  • The buffer is 32k which is the same as the block size I use in dd. These translate into the same system calls so what else is there to experiment with? Also I open both with the direct flag so the OS will not buffer it. – dschatz Aug 13 '10 at 19:24