
I have an image stream coming in from a camera at about 100 frames/second, with each image being about 2 MB. Because of the disk write speed alone I know I can't write every frame, so I'm only trying to save about a third of those frames each second.

The stream is a circular buffer of large char arrays. Right now I'm using fwrite to dump each array to a temporary file as it gets buffered, but it only seems to be writing at about 20-30 MB/s, while the hard drive should theoretically reach 80-100 MB/s.
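A simplified sketch of the save path (hedged: only the `fwrite` call is from my actual code, quoted again in the comments below; the helper function and its signature are just for illustration):

```cpp
#include <cstddef>
#include <cstdio>

// Each buffered frame is dumped to the temp file as soon as it is available.
void dumpFrame(const unsigned char* pBuf, std::size_t BufSize, std::FILE* pFile) {
    std::fwrite(pBuf, sizeof(unsigned char), BufSize, pFile);
}
```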

Any thoughts? Is there a faster way to write than fwrite(), or a way to optimize it? More generally, what is the fastest way to dump large amounts of data to a standard hard drive?

user1359341
  • Use an OS specific call such as `write` on *nix systems to get unbuffered calls? – dirkgently May 27 '12 at 00:06
  • Just a hint, but preallocating the file might help; write a few gigabytes of `NUL` bytes to a file (and don't just `fseek()` and write at the end, that'll be [sparse](http://en.wikipedia.org/wiki/Sparse_file)), so the filesystem doesn't need to find a place for the blocks while you're receiving data from the camera. – Asherah May 27 '12 at 00:07
  • @dirkgently Wouldn't using a primitive without buffering be less efficient unless you are passing a chunk of data which is precisely a multiple of the sector size? – SJuan76 May 27 '12 at 00:14
  • @SJuan76: That'd be ideal. But the general `FILE`/`ofstream` will probably not be optimized for any particular system. At least, with a primitive, the OP has some semblance of a chance of extracting a fair bit of system specific advantage. – dirkgently May 27 '12 at 00:17
  • http://stackoverflow.com/questions/2380071/all-things-equal-what-is-the-fastest-way-to-output-data-to-disk-in-c – Ken White May 27 '12 at 00:28
  • What are the values of the 2nd and 3rd arguments to your `fwrite` calls? `fwrite`ing 1 byte at a time will go a lot slower than `fwrite`ing 64K at a time. Also, are you calling `fopen` for **each** frame? Writing all of the frames to a single file might go faster. – Robᵩ May 27 '12 at 01:34
  • Are you sure your performance is limited by disk I/O? `fwrite` itself is quite efficient (compared to say `ofstream`), but your other processing might well be CPU-bound. – Ben Voigt May 27 '12 at 05:01
  • @dirkgently I tried this, it went slower for some reason, think it doesn't really matter with writes this large. – user1359341 May 27 '12 at 20:13
  • @Robᵩ: the line is fwrite(pBuf,sizeof(unsigned char),BufSize,pFile); where BufSize is the size of each array in the buffer (about 1000x1000) – user1359341 May 27 '12 at 20:16
  • @Ben Voigt: There is no processing, I'm just dumping the buffer to disk. Loading each image into the buffer takes about 2-3 ms, which shouldn't put a noticeable dent in the write speed. – user1359341 May 27 '12 at 20:18

3 Answers


What if you use memory-mapped files limited to, for example, 1 GB each? That should provide enough speed and buffering to work with all the frames, especially if you manage to perform zero-copy frame allocation.
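A minimal sketch of the idea, assuming a POSIX system; the file name, 1 GB chunk size and 2 MB frame size are illustrative (on Windows the equivalents would be `CreateFileMapping`/`MapViewOfFile`, see the comment below):

```cpp
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>
#include <vector>

static const size_t kChunkSize = 1ULL << 30;       // one 1 GB mapped chunk file
static const size_t kFrameSize = 2 * 1024 * 1024;  // ~2 MB per frame (from the question)

int main() {
    // Create a fixed-size chunk file and map it into memory.
    int fd = open("chunk000.raw", O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, kChunkSize) != 0) { perror("ftruncate"); return 1; }

    char* base = static_cast<char*>(
        mmap(nullptr, kChunkSize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
    if (base == MAP_FAILED) { perror("mmap"); return 1; }

    std::vector<char> frame(kFrameSize, 0);  // stand-in for a frame from the circular buffer

    // Copy frames into the mapping; for true zero-copy the camera/driver
    // would fill base + offset directly instead of this memcpy.
    for (size_t offset = 0; offset + kFrameSize <= kChunkSize; offset += kFrameSize) {
        std::memcpy(base + offset, frame.data(), kFrameSize);
    }

    msync(base, kChunkSize, MS_SYNC);  // flush the chunk to disk
    munmap(base, kChunkSize);
    close(fd);
    // ...then open and map the next chunk file (chunk001.raw, ...) and repeat.
    return 0;
}
```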

Forgottn
  • From what I've read, memory-mapped files are best for multiple reads and writes to the same file. But the file has to be small enough to fit in memory, correct? I need to write tens to hundreds of gigabytes. – user1359341 May 27 '12 at 20:12
  • Well, check the `mmap` restrictions if you're on a POSIX system. As far as I understand, there is a limit on how many bytes can be mapped into memory at once. On Windows, `CreateFileMapping` lets you create a mapping of practically unlimited size, but to work with the memory itself you have to reserve a view with `MapViewOfFile`, and that is limited by the application's address space. Note: both systems prefer sizes aligned to the page size. – Forgottn May 29 '12 at 17:31

fwrite is buffered, which is what you want. Though with files and writes this big, the buffering shouldn't make much (if any) difference. Maybe experiment with a larger stream buffer via the setbuf (or setvbuf) call.
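Something like this (a minimal sketch; the 4 MB buffer size, file name and frame count are arbitrary choices for illustration):

```cpp
#include <cstdio>
#include <vector>

int main() {
    std::FILE* pFile = std::fopen("frames.raw", "wb");
    if (!pFile) return 1;

    // setvbuf must be called after opening but before the first I/O on the stream.
    std::vector<char> streamBuf(4 * 1024 * 1024);
    std::setvbuf(pFile, streamBuf.data(), _IOFBF, streamBuf.size());

    std::vector<unsigned char> pBuf(2 * 1024 * 1024);  // one ~2 MB frame
    for (int frame = 0; frame < 30; ++frame) {          // e.g. ~30 frames saved per second
        // ... copy the next frame from the circular buffer into pBuf ...
        std::fwrite(pBuf.data(), sizeof(unsigned char), pBuf.size(), pFile);
    }

    std::fclose(pFile);
    return 0;
}
```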

Since you are limited by physical disk I/O speed, as long as you make it as easy as possible for the system to use each available disk I/O efficiently, there's not really much more you can do.

vmstat on Linux (and similar tools on other systems) can tell you how many disk I/Os your disk is doing, so you can test whether your changes help.

Ask Bjørn Hansen

Asynchronous, non-buffered output is the key to success in your case. Buffered I/O will only cause double-buffering overhead, and synchronous I/O will make the HDD heads miss sequential sectors.

Boost.Asio provides a relatively good encapsulation of system-specific APIs for popular platforms.

There are a few things to remember:

  • on most non-Windows platforms you will have to write to raw partitions to get the system's buffering and internal threading out of the way.
  • keep the write queue non-empty all the time, so the SATA controller can help you by means of NCQ.
  • pay attention to the system-specific requirements on buffer alignment and size for async non-buffered I/O to work.
  • the file open mode also matters to make the system do what you want (a rough sketch covering these points follows the list).
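A rough sketch of those points, assuming Linux with POSIX AIO (`aio_write`) and `O_DIRECT` rather than Boost.Asio; the file name, frame count, queue depth and 4096-byte alignment are illustrative assumptions (query the real block size of your device/filesystem instead of assuming 4096, and link with `-lrt` on older glibc):

```cpp
#include <aio.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>

static const size_t kFrameSize  = 2 * 1024 * 1024;  // ~2 MB per frame (from the question)
static const size_t kAlignment  = 4096;             // assumed logical block size
static const int    kQueueDepth = 4;                // keep several writes in flight for NCQ

int main() {
    // O_DIRECT bypasses the page cache; buffer address, transfer size and
    // file offset must all be multiples of the block size.
    int fd = open("frames.raw", O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    aiocb cbs[kQueueDepth];
    void* bufs[kQueueDepth];
    std::memset(cbs, 0, sizeof(cbs));
    for (int i = 0; i < kQueueDepth; ++i)
        if (posix_memalign(&bufs[i], kAlignment, kFrameSize) != 0) return 1;

    off_t offset = 0;
    for (int frame = 0; frame < 100; ++frame) {  // pretend we save 100 frames
        int slot = frame % kQueueDepth;
        if (frame >= kQueueDepth) {
            // Wait for the previous write in this slot before reusing its buffer.
            const aiocb* pending[1] = { &cbs[slot] };
            aio_suspend(pending, 1, nullptr);
            if (aio_return(&cbs[slot]) < 0) { perror("aio_write"); return 1; }
        }
        // ... copy the next camera frame into bufs[slot] here ...

        std::memset(&cbs[slot], 0, sizeof(aiocb));
        cbs[slot].aio_fildes = fd;
        cbs[slot].aio_buf    = bufs[slot];
        cbs[slot].aio_nbytes = kFrameSize;  // multiple of kAlignment
        cbs[slot].aio_offset = offset;
        if (aio_write(&cbs[slot]) != 0) { perror("aio_write"); return 1; }
        offset += kFrameSize;
    }

    // Drain the writes still in flight.
    for (int i = 0; i < kQueueDepth; ++i) {
        const aiocb* pending[1] = { &cbs[i] };
        aio_suspend(pending, 1, nullptr);
        aio_return(&cbs[i]);
        std::free(bufs[i]);
    }
    close(fd);
    return 0;
}
```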
Krit
  • Would asynchronous I/O apply here? I've looked into it before, but it always seemed overwhelming, especially Boost.Asio, which has next to no documentation. It seemed like it would only help if there was processing to be done between writes, so that could happen while the write was occurring. But since all I'm doing is writing the buffer, would this help? – user1359341 May 27 '12 at 20:10
  • @user1359341 - well, if your buffers are relatively big (hundreds of MB), then you may reach the top write speed of your HD even with blocking IO (given that you do only one write at any given moment). But how would you make your system self-balancing? For example, you want to make the frame-drop rate a function of the output stream speed. And that speed may vary for an HDD depending on which track you write, or it may change significantly if you set up a RAID, or upgrade to an SSD, etc. So a mid-size AIO event-driven architecture is the right choice. – Krit May 28 '12 at 17:27