I'm working on an embedded Linux video recorder application which writes MP4 format video to a file (on FAT format SD card).

Some complicating factors are that the video and audio data come from hardware codecs which have to be serviced with low latency, and must write into DMA-capable buffers.

At the moment for the output file I use open() and write(), but find that write() can take hundreds of milliseconds to return when the system is under load, so my writes are done in a separate thread.

As it stands I copy data from the DMA buffers (which are small and few in number) to a multi-megabyte malloc'd circular buffer, then write() from that in another thread. This means I'm doing at least two copies: once into the app buffer, and once into the system buffer cache.
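For readers who want to picture that arrangement, here is a minimal sketch of a circular buffer drained by a writer thread. It is not the asker's code: `struct ring`, `ring_init`, `ring_put`, `ring_finish`, `ring_writer` and the 4 MiB `RING_SIZE` are all hypothetical names and values chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define RING_SIZE (4u * 1024 * 1024) /* assumed capacity; size it for the worst stall */

struct ring {
    unsigned char *data;     /* RING_SIZE bytes of ordinary heap memory */
    size_t head, tail, used; /* producer index, consumer index, bytes queued */
    int fd;                  /* output file */
    int done;                /* producer has finished */
    int err;                 /* errno from a failed write(), else 0 */
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
};

int ring_init(struct ring *r, int fd)
{
    memset(r, 0, sizeof *r);
    r->fd = fd;
    r->data = malloc(RING_SIZE);
    if (r->data == NULL)
        return -1;
    pthread_mutex_init(&r->lock, NULL);
    pthread_cond_init(&r->nonempty, NULL);
    return 0;
}

/* Producer: copy one DMA buffer in. Returns -1 if the ring is full. */
int ring_put(struct ring *r, const void *src, size_t n)
{
    assert(n <= RING_SIZE);
    pthread_mutex_lock(&r->lock);
    if (RING_SIZE - r->used < n) {
        pthread_mutex_unlock(&r->lock);
        return -1;
    }
    size_t first = RING_SIZE - r->head; /* room before the wrap point */
    if (first > n)
        first = n;
    memcpy(r->data + r->head, src, first);
    memcpy(r->data, (const unsigned char *)src + first, n - first);
    r->head = (r->head + n) % RING_SIZE;
    r->used += n;
    pthread_cond_signal(&r->nonempty);
    pthread_mutex_unlock(&r->lock);
    return 0;
}

/* Producer: tell the writer thread to exit once the ring is drained. */
void ring_finish(struct ring *r)
{
    pthread_mutex_lock(&r->lock);
    r->done = 1;
    pthread_cond_signal(&r->nonempty);
    pthread_mutex_unlock(&r->lock);
}

/* Writer thread: drain contiguous runs; the lock is not held during write(). */
void *ring_writer(void *arg)
{
    struct ring *r = arg;
    for (;;) {
        pthread_mutex_lock(&r->lock);
        while (r->used == 0 && !r->done)
            pthread_cond_wait(&r->nonempty, &r->lock);
        if (r->used == 0) { /* finished and fully drained */
            pthread_mutex_unlock(&r->lock);
            return NULL;
        }
        size_t tail = r->tail;
        size_t n = RING_SIZE - tail; /* contiguous bytes up to the wrap point */
        if (n > r->used)
            n = r->used;
        pthread_mutex_unlock(&r->lock);

        ssize_t w = write(r->fd, r->data + tail, n); /* may stall for a long time */
        if (w < 0) {
            if (errno == EINTR)
                continue;
            r->err = errno;
            return NULL;
        }
        pthread_mutex_lock(&r->lock);
        r->tail = (r->tail + (size_t)w) % RING_SIZE;
        r->used -= (size_t)w;
        pthread_mutex_unlock(&r->lock);
    }
}
```

The point of the design is that a slow write() only delays the writer thread; the thread servicing the codecs blocks on nothing longer than a memcpy under the lock, and gets an explicit failure from `ring_put` if the ring ever fills.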

I am considering trying O_DIRECT writes to avoid a copy, but am interested in any comments. I note that Robert Love comments that O_DIRECT is terrible but does not say why.
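To make the O_DIRECT idea concrete, here is a rough sketch of what such writes generally require. The helper names (`open_direct`, `alloc_direct`, `write_direct`) and the 4096-byte `DIO_ALIGN` are assumptions for illustration; the real alignment requirement depends on the kernel, filesystem and block device, and I have not verified how this behaves on FAT.

```c
#define _GNU_SOURCE /* glibc exposes O_DIRECT only with this defined */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define DIO_ALIGN 4096u /* assumed alignment; check the actual device requirement */

/* Open the output file for direct I/O, bypassing the page cache. */
int open_direct(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
}

/* Allocate a buffer whose address is aligned for O_DIRECT transfers. */
void *alloc_direct(size_t len)
{
    void *p = NULL;
    assert(len % DIO_ALIGN == 0);
    if (posix_memalign(&p, DIO_ALIGN, len) != 0)
        return NULL;
    return p;
}

/* Write one chunk, refusing sizes or addresses that are not aligned,
 * since the kernel would typically reject those with EINVAL anyway. */
ssize_t write_direct(int fd, const void *buf, size_t len)
{
    if (len == 0 || len % DIO_ALIGN != 0 || (uintptr_t)buf % DIO_ALIGN != 0) {
        errno = EINVAL;
        return -1;
    }
    return write(fd, buf, len);
}
```

Note that the file offset usually has to be aligned too, so the final, odd-sized piece of a recording needs padding or a separate buffered write. Also, O_DIRECT removes the copy into the page cache but does nothing by itself about write() blocking, so a separate writer thread would still be needed.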

On the flip side, I would also be interested if anyone knows a way to keep write() from stalling for long periods (AIO, perhaps?), so that I could use the buffer cache as Linus intended.

This question is not unrelated to my very old question about write stalls.

blueshift
  • SD card write performance can vary greatly depending on the manufacturer, operating system, sector size, etc. Maybe you've already done this, but are your writes done on good-sized boundaries (e.g. 64K at a time)? Have you tried pre-allocating the file on the SD card to a size larger than needed so that the sectors have already been reserved in the FAT? – BitBank Mar 16 '12 at 16:19
  • @BitBank I do binary-round writes but no prealloc yet, was thinking of giving it a try. Have you some experience with that? – blueshift Mar 17 '12 at 10:49
  • I read somewhere about SD write delays due to FAT sector allocation. – BitBank Mar 17 '12 at 12:55
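The pre-allocation idea from these comments could be tried along the following lines. This is only a sketch: `open_preallocated` and `finish_recording` are hypothetical helpers, and whether it actually reduces stalls on a FAT-formatted SD card is untested here. If the filesystem has no native support, glibc may emulate `posix_fallocate()` by writing into every block, which itself takes time up front.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Create the output file and reserve `reserve` bytes before recording starts. */
int open_preallocated(const char *path, off_t reserve)
{
    assert(reserve > 0);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    int err = posix_fallocate(fd, 0, reserve); /* returns an errno value, not -1 */
    if (err != 0) {
        close(fd);
        errno = err;
        return -1;
    }
    return fd;
}

/* Trim the file back to the number of bytes actually recorded, then close it. */
int finish_recording(int fd, off_t actual)
{
    int rc = ftruncate(fd, actual);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Recording would then write from offset 0 as usual, with `finish_recording(fd, bytes_written)` called at the end so the file does not keep the unused reserve.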

1 Answer

If this is really an embedded product in that you control your driver source, I'd seriously look into mmap'ing the memory so that the user process can access the same memory as the device driver, avoiding all those copies.

And here is a sample implementation of driver memory being shared with a userspace process using mmap.

idlethread
  • The process does mmap() the incoming DMA buffers. Are you suggesting to use mmap() for the write side as an alternative to O_DIRECT? Interested if you have any experience or documentation about that use. – blueshift Mar 20 '12 at 03:28
  • Basically, the driver would use remap_pfn_range to map its memory for the userspace. I've added a link to a sample I found through Google. – idlethread Mar 21 '12 at 01:08
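For reference, the userspace half of that arrangement might look like the sketch below. `map_shared` is a hypothetical helper, and it assumes the driver exposes a device node whose mmap handler maps the driver's buffers (for example via remap_pfn_range, as mentioned above); the driver side is not shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `len` bytes of a device node (or any mmap-capable file) shared and
 * read/write. Returns NULL on failure. */
void *map_shared(const char *path, size_t len)
{
    assert(len > 0);
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}
```

A call such as `map_shared("/dev/vcodec0", buf_len)` (the device path is made up) gives the process a pointer into the driver's memory. Passing that pointer to an ordinary buffered write() still copies the data into the page cache, so this removes the copy into an application buffer but not the second one.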