
I have added AIO support to my driver (the .aio_read and .aio_write calls in kernel land, libaio in userland). Looking at various sources, I cannot tell whether my .aio_read/.aio_write handlers can simply store a pointer to the iovec argument (on the assumption that this memory stays untouched until after, e.g., aio_complete is called), or whether I need to deep-copy the iovec data structures.

static ssize_t aio_read( struct kiocb *iocb, const struct iovec *iovec, unsigned long nr_segs, loff_t pos );
static ssize_t aio_write( struct kiocb *iocb, const struct iovec *iovec, unsigned long nr_segs, loff_t pos );

Looking at the implementation of \drivers\usb\gadget\inode.c as an example, it seems they just copy the pointer in the ep_aio_rwtail function which has:

priv->iv = iv;

But when I try something similar, the data in the iovec has very often been "corrupted" by the time I process it.

E.g., in the aio_read/write calls I log:

iovector located at addr:0xbf1ebf04
segment 0: base: 0x76dbb468        len:512

But then when I do the real work in a kernel thread (after attaching to the user space mm) I logged the following:

iovector located at addr:0xbf1ebf04
segment 0: base: 0x804e00c8        len:-1088503900

This is with a very simple test case where I only submit 1 asynchronous command in my user application.

To make things more interesting: I have the corruption about 80% of the time on a 3.13 kernel.

I never saw it on a 3.9 kernel, although I only used that kernel briefly before upgrading to 3.13. I have now reverted to 3.9 as a sanity check and tried a dozen times or so. An example run on the 3.9 kernel logs the following twice:

iovector located at addr:0xbf9ee054
segment 0: base: 0x76e28468        len:512

Does this ring any bells?

(The other possibility is that I am corrupting these addresses/lengths myself of course, but it is strange that I never had this with a 3.9)

EDIT: To answer my own question: I reviewed the 3.13 code for Linux AIO (which has changed significantly from the 3.9 code that was working), and fs\aio.c contains:

static ssize_t aio_run_iocb(struct kiocb *req, unsigned opcode,
                            char __user *buf, bool compat)
{
    ...
    struct iovec inline_vec, *iovec = &inline_vec;
    ...
    ret = rw_op(req, iovec, nr_segs, req->ki_pos);
    ...
}

So this iovec structure lives on the stack of aio_run_iocb, and it is lost once the aio_read/write call returns.

And the gadget framework contains a bug (at least for 3.13) in \drivers\usb\gadget\inode.c...
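Since the iovec array is only valid for the duration of the call, one way to keep the segment list usable afterwards is to copy the descriptors into memory owned by the request. The following is a minimal user-space sketch of that idea, not the actual driver code: `struct my_request`, `my_request_save_iov` and `my_request_free_iov` are hypothetical names, and `malloc`/`free` stand in for the allocator; in a driver one would presumably use `kmalloc()` or `kmemdup()` and call `kfree()` once the request has completed.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical per-request state; name and layout are illustrative only. */
struct my_request {
    struct iovec *iov;      /* private copy of the segment array */
    unsigned long nr_segs;  /* number of entries in iov */
};

/*
 * Copy the nr_segs iovec descriptors into memory owned by the request.
 * Only the descriptors (base pointer and length) are copied, not the
 * buffers they point to.
 * Returns 0 on success, -1 on bad arguments or allocation failure.
 */
int my_request_save_iov(struct my_request *req,
                        const struct iovec *iov,
                        unsigned long nr_segs)
{
    size_t bytes;
    struct iovec *copy;

    if (req == NULL || iov == NULL || nr_segs == 0)
        return -1;
    /* Reject counts whose byte size would overflow size_t. */
    if (nr_segs > SIZE_MAX / sizeof(*iov))
        return -1;

    bytes = nr_segs * sizeof(*iov);
    copy = malloc(bytes);   /* assumption: kmalloc()/kmemdup() in a driver */
    if (copy == NULL)
        return -1;

    memcpy(copy, iov, bytes);
    req->iov = copy;
    req->nr_segs = nr_segs;
    return 0;
}

/* Release the private copy once the request has been completed. */
void my_request_free_iov(struct my_request *req)
{
    free(req->iov);         /* assumption: kfree() in a driver */
    req->iov = NULL;
    req->nr_segs = 0;
}
```

With a copy like this, the worker thread reads `req->iov` rather than the pointer it was handed, so it no longer matters that the caller's stack frame has been reused by the time the work is done.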

Bram

1 Answer


From the man page for aio_read:

NOTES

It is a good idea to zero out the control block before use. The control block must not be changed while the read operation is in progress. The buffer area being read into must not be accessed during the operation or undefined results may occur. The memory areas involved must remain valid.

Simultaneous I/O operations specifying the same aiocb structure produce undefined results.

This suggests the driver can rely on the user's data structures during the operation. It would be prudent to abandon the operation and return an asynchronous error if, during the operation, you detect those structures have changed.

wallyk
  • Indeed, for the control block they mention it. But the "const struct iovec *iovec" argument is not part of the control block, as it is not inside "struct kiocb *iocb". And in my case I am following these rules in user space: to keep things as simple as possible, this test case has only one control block, and the user thread waits until it finishes. It is a write to a single address, by the way, so I only get 1 segment in the iovec. But I will add some checks on the iocb block as well to see whether that one changes too... – Bram Mar 06 '14 at 15:27
  • Small update: the iocb area remains uncorrupted on the 3.13 even when the iovector is corrupted. Example addresses in case it helps: 'pIOCB at addr:0xbfb91200 Contents: bf01bd80 bfba2000 0000 <...>' – Bram Mar 06 '14 at 15:47