Can anyone point me to some (preferably commonly used) applications that use kernel AIO (i.e., the io_submit() family), such as a SQL or NoSQL database? I want the application to be able to issue asynchronous reads with a queue depth of more than 1 on each thread, so that it fully saturates a highly parallel SSD that supports more than 64 in-flight requests without noticeable degradation.

I am aware of InnoDB, but am looking for something simpler (possibly a KV store).

Update: I am not looking for sample code or synthetic benchmarks like fio+libaio. I am interested in finding a set of applications that can saturate the device in a more realistic setting.

  • You're aware, right, that you don't need async I/O to achieve this? Large I/O blocks are broken into chunks to increase queue depth for you. Indeed, PCIe is incapable of bursting more than 4 KB at once, so that's your maximum I/O block no matter what. Also, most non-direct async file I/O kernel implementations (except for Windows) are implemented as threads, either in user space or in kernel space, so you'll get better performance using a thread pool on those systems. Most databases provide a thread-pool-based backend for that reason. – Niall Douglas May 06 '16 at 23:23
  • Let me put this another way: launching four copies of the 'dd' command copying data around will easily saturate any SSD, even the top end PCIe ones. Queue depth rapidly maxes out. – Niall Douglas May 06 '16 at 23:25

1 Answer

One simple example of io_submit/io_getevents is in fio, a program that's an enormous help for testing and profiling block devices. It has a number of different I/O backends to support different OSes and different access techniques. The Linux AIO wrapper is in the author's repository on GitHub: https://github.com/axboe/fio/blob/master/engines/libaio.c

The fio code is simple, but it's missing the eventfd integration, which you'll likely need. (I always have.) For this, the slightly more complicated but still straightforward code in QEMU's block layer provides a good example: https://github.com/qemu/qemu/blob/331ac65963ab74dd84659b748affa0b111486f06/block/linux-aio.c
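To make the io_submit/io_getevents plus eventfd pattern concrete, here is a minimal sketch. It is not taken from fio or QEMU. The helper name `aio_read_batch` and the constants `BLOCK` and `MAX_DEPTH` are made up for illustration, and it calls the raw syscalls rather than libaio. It submits `depth` 4 KB reads in one io_submit call and waits on the eventfd before reaping completions. Note that buffered reads are typically completed inside io_submit, so for genuinely asynchronous behavior you would most likely want `use_direct` set, which requires a filesystem that supports O_DIRECT.

```c
#define _GNU_SOURCE                 /* exposes O_DIRECT */
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/syscall.h>
#include <unistd.h>

#define BLOCK 4096                  /* illustrative request size */
#define MAX_DEPTH 64                /* illustrative queue-depth cap */

/* Hypothetical helper: read the first `depth` blocks of `path` with all
 * requests in flight at once. Returns the number of full-block reads
 * that completed, or -1 on error. */
int aio_read_batch(const char *path, int depth, int use_direct)
{
    struct iocb cbs[MAX_DEPTH];
    struct iocb *ptrs[MAX_DEPTH];
    struct io_event evs[MAX_DEPTH];
    aio_context_t ctx = 0;
    void *buf = NULL;
    long submitted, seen = 0;
    int fd, efd, have_ctx = 0, ok = -1;

    if (depth < 1 || depth > MAX_DEPTH)
        return -1;
    fd = open(path, O_RDONLY | (use_direct ? O_DIRECT : 0));
    efd = eventfd(0, 0);
    if (fd < 0 || efd < 0)
        goto out;
    /* O_DIRECT needs aligned buffers; aligning is harmless otherwise. */
    if (posix_memalign(&buf, BLOCK, (size_t)depth * BLOCK) != 0) {
        buf = NULL;
        goto out;
    }
    if (syscall(SYS_io_setup, (long)depth, &ctx) < 0)
        goto out;
    have_ctx = 1;

    memset(cbs, 0, sizeof cbs);
    for (int i = 0; i < depth; i++) {
        cbs[i].aio_lio_opcode = IOCB_CMD_PREAD;
        cbs[i].aio_fildes = fd;
        cbs[i].aio_buf = (uint64_t)(uintptr_t)((char *)buf + (size_t)i * BLOCK);
        cbs[i].aio_nbytes = BLOCK;
        cbs[i].aio_offset = (int64_t)i * BLOCK;
        cbs[i].aio_flags = IOCB_FLAG_RESFD;   /* signal efd on completion */
        cbs[i].aio_resfd = efd;
        ptrs[i] = &cbs[i];
    }

    /* One call queues every request; the return value is how many were accepted. */
    submitted = syscall(SYS_io_submit, ctx, (long)depth, ptrs);
    if (submitted < 0)
        goto out;

    ok = 0;
    while (seen < submitted) {
        uint64_t ready;
        /* Blocks until at least one completion has been signalled. */
        if (read(efd, &ready, sizeof ready) != (ssize_t)sizeof ready) {
            ok = -1;
            break;
        }
        long n = syscall(SYS_io_getevents, ctx, 1L, (long)depth, evs, NULL);
        if (n < 0) {
            ok = -1;
            break;
        }
        for (long i = 0; i < n; i++)
            if (evs[i].res == BLOCK)          /* res holds bytes read or -errno */
                ok++;
        seen += n;
    }

out:
    if (have_ctx)
        syscall(SYS_io_destroy, ctx);
    free(buf);
    if (efd >= 0)
        close(efd);
    if (fd >= 0)
        close(fd);
    return ok;
}
```

Calling `aio_read_batch("/path/to/file", 64, 1)` on a file of at least 256 KB would keep 64 reads in flight from a single thread. In a real event loop, the eventfd would be registered with epoll or select instead of being read with a blocking read().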

You may find you can saturate your SSD from a single thread! Or, at least, it's worth testing. Fio can give you a good idea of what kind of throughput you can expect before you write the code. You can even configure it to do multiple threads.

Mike Andrews
  • Thanks @gubblebozer. I know about `fio`, and that is how I know the device can serve up to 64 4 KB random reads in parallel. I am looking for a realistic situation that can push the device that hard. – Mohammad Hedayati May 06 '16 at 15:47