Is it possible for read to

  • block
  • return less data than requested

when reading from a regular file, excluding:

  • requesting more data than SSIZE_MAX
  • reading beyond EOF
  • interruption by a signal

read(3) suggests that, excluding the above conditions, when reading from a regular file read will never return fewer bytes than requested.

The value returned may be less than nbyte if the number of bytes left in the file is less than nbyte, if the read() request was interrupted by a signal, or if the file is a pipe or FIFO or special file and has fewer than nbyte bytes immediately available for reading.

However, this answer suggests a hypothetical in which read may return fewer bytes than requested if the kernel wishes to prioritize other I/O. Although hypothetical, the point is that under no conditions can read be expected to return exactly as much data as requested. So it is never safe, even if the above three conditions (SSIZE_MAX, EOF, interrupt) do not apply, to use read on a regular file while checking the return value only for errors:

// all blockable signals have been ignored
// 10 is guaranteed less than SSIZE_MAX
// file size is known, access is locked
if (read(fd_of_big_reg_file_with_zero_offset, buf, 10) < 0) {
    // so all we have to do is handle errors
}

Furthermore, I have never seen a read on a regular file block, but I assume it is possible in the event of a recoverable I/O error, such as a bad block requiring multiple rereads.

user19087
  • I don't know of a situation when read would return less data (other than the ones put up above). – Ani Menon May 05 '16 at 04:27
  • There are NO cases where the return value of `read` should not be checked. Also, you have it backwards w.r.t. blocking. Read on a regular file ALWAYS blocks, i.e. there's no "can't read now, try again later" situation which is characteristic of non-blocking IO. – n. m. could be an AI May 05 '16 at 05:09
  • Also, note that "ignoring all blockable signals" doesn't mean the system call can't be interrupted by a signal. SIGSTOP can't be ignored or blocked, and SIGCONT afterward would make the process continue, even if it's ignored or blocked. And depending on signal semantics in effect, that could make `read` return with a short count. So nothing you can do within your program can eliminate that possibility. – Nate Eldredge May 05 '16 at 15:07

1 Answer

One way to get a short read (in addition to the cases mentioned in your question) is if an I/O error occurs in the middle of a read.

Imagine for example that you have a regular file of size 1024, occupying two 512-byte sectors. Unbeknown to you, the second sector is bad and cannot be read. Opening the file and doing read(fd, buf, 1024) will return 512 and will not set errno. If you try to read again, you get a return value of -1 and errno = EIO.

I was able to test this on Linux using the device mapper's error function.

Since there isn't anything your program can do to rule out the possibility of I/O errors, this implies that it is never safe to assume that any positive return value from read must mean you read as many bytes as requested.
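In practice this usually means wrapping read in a loop. The following is only a minimal sketch: read_full is a hypothetical helper name, not a standard function, and it assumes count is no larger than SSIZE_MAX. It retries on EINTR, keeps going after a short read, and stops at EOF or on a real error such as EIO.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * read_full: hypothetical helper, not part of any library.
 * Assumes count <= SSIZE_MAX.
 *
 * Calls read() repeatedly until count bytes have been stored in buf,
 * EOF is reached, or an error other than EINTR occurs.
 *
 * Returns the number of bytes stored. A result smaller than count means
 * EOF or an error that struck after some data had already arrived.
 * Returns -1 with errno set if the error struck before any byte was read.
 */
ssize_t read_full(int fd, void *buf, size_t count)
{
    unsigned char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t n = read(fd, p + total, count - total);

        if (n > 0) {
            total += (size_t)n;   /* possibly a short read: ask for the rest */
        } else if (n == 0) {
            break;                /* EOF */
        } else if (errno == EINTR) {
            continue;             /* interrupted before any data moved: retry */
        } else {
            /* Real error (e.g. EIO). Report the partial count if there is
               one, mirroring read(); the next call will surface the error. */
            return total > 0 ? (ssize_t)total : -1;
        }
    }
    return (ssize_t)total;
}
```

The caller still has to compare the result with the requested count: anything smaller means the file ended early or an error cut the transfer short, and a follow-up call would be expected to report which.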

Nate Eldredge
  • It's also possible, albeit implementation-dependent, to get a short read if the `read()` call is interrupted by a signal after just some of the requested data has been transferred. – Andrew Henle May 05 '16 at 10:13
  • @AndrewHenle: Well, I wasn't claiming this is the *only* way to get a short read. Edited to clarify. – Nate Eldredge May 05 '16 at 16:16