17

The function signature for write(2) is ssize_t write(int fd, const void *buf, size_t count). Generally, the maximum value of size_t is greater than that of ssize_t. Does this mean the amount of data that write can actually write is bounded by SSIZE_MAX rather than SIZE_MAX? If that is not the case, what happens, with respect to overflow, when the number of bytes written is greater than SSIZE_MAX?

I am essentially wondering whether the amount of data written by write is bounded by SSIZE_MAX or SIZE_MAX.
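
For reference, here is a minimal sketch of the kind of call I mean (a hypothetical buffer, with error handling kept short): the count goes in as a size_t, but the result comes back in a ssize_t.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char msg[] = "hello\n";

    /* write() accepts a size_t count, but the result comes back in a
       ssize_t, so the largest count it can ever *report* is SSIZE_MAX. */
    ssize_t written = write(STDOUT_FILENO, msg, sizeof msg - 1);
    if (written == -1)
        fprintf(stderr, "write failed: %s\n", strerror(errno));
    else
        printf("wrote %zd of %zu bytes\n", written, sizeof msg - 1);
    return 0;
}
```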

Jonathan Leffler
Eric Pruitt
  • Processes also have limits. A write that exceeds the process's file size limit will fail (error `EFBIG`), so that limit might be smaller than `SSIZE_MAX`, though I'm not sure about this. – yeyo Apr 18 '15 at 21:34
  • `ssize_t` stands for "signed size", meaning it is a signed type; the sign is there so the write system call can return -1 when an error occurs. So the maximum count that write can report cannot be greater than what a signed size type can hold. – madz Apr 18 '15 at 21:35
  • pendrive, I know as much. That doesn't address my question at all. The return value is `ssize_t`, which has a more limited range than `size_t` on most systems, so I am essentially asking whether writes are bounded by `SSIZE_MAX` or `SIZE_MAX`. – Eric Pruitt Apr 18 '15 at 21:49
  • Note that objects larger than `PTRDIFF_MAX` (usually equal to `SIZE_MAX/2` and `SSIZE_MAX`) should not be able to arise on a high-quality implementation. If they do, pointer subtraction is unsafe (it can overflow and produce UB). So there should never be a valid size you could pass to `write` larger than `SSIZE_MAX`. – R.. GitHub STOP HELPING ICE Apr 18 '15 at 23:35
  • `PTRDIFF_MAX` is different from `SIZE_MAX/2`. Consider `x86-16` with the large memory model. Pointers are far (32-bit), but individual objects are limited to one segment (so `size_t` is allowed to be 16-bit). [ref](https://stackoverflow.com/a/8649077/136285). – malat Dec 15 '22 at 08:25

3 Answers

16

The type ssize_t is defined by POSIX as a signed type that must be able to store values of at least 32767 (_POSIX_SSIZE_MAX), with no further guarantees. So its maximum value can be less than the maximum value of size_t.

ssize_t's POSIX definition:

ssize_t

Used for a count of bytes or an error indication.

So it's possible that the number of bytes you request to write is greater than what ssize_t can hold. In that case, POSIX leaves the behaviour to the implementation.

From write()'s POSIX spec:

ssize_t write(int fildes, const void *buf, size_t nbyte);

If the value of nbyte is greater than {SSIZE_MAX}, the result is implementation-defined.
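
If you want to stay inside the part of that clause where the result is well defined, one option is to cap each request at SSIZE_MAX yourself. A rough sketch, not a definitive implementation (the wrapper name write_capped is made up here, and callers still have to check for short writes and resubmit the remainder):

```c
#include <limits.h>   /* SSIZE_MAX */
#include <unistd.h>   /* write, size_t, ssize_t */

/* Hypothetical wrapper: caps every request at SSIZE_MAX so the
   "nbyte greater than SSIZE_MAX is implementation-defined" case
   can never be hit. The caller still has to handle short writes. */
static ssize_t write_capped(int fd, const void *buf, size_t count)
{
    if (count > (size_t)SSIZE_MAX)
        count = (size_t)SSIZE_MAX;
    return write(fd, buf, count);
}
```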

alk
P.P
  • I know it's wandering a bit, but if `-ansi` is used, then `SSIZE_MAX` and `_POSIX_SSIZE_MAX` may not be defined on GNU systems. Also see [SSIZE_MAX on ia64](https://sourceware.org/ml/libc-hacker/2002-08/msg00031.html); the discussion quickly expanded to include all glibc systems. – jww Mar 25 '16 at 12:19
  • Linux seems to return `EINVAL` if `nbyte > SSIZE_MAX+1`. (It does accept `SSIZE_MAX+1`, returning `0`.) It's unfortunate this hasn't simply been handled in the standard by mandating a partial write of at most `SSIZE_MAX` bytes. – Petr Skocik Dec 05 '17 at 18:42
15

The POSIX specification for write() says:

If the value of nbyte is greater than {SSIZE_MAX}, the result is implementation-defined.

So any attempt to write more than SSIZE_MAX bytes leads to behaviour that is not mandated by POSIX, but that must be documented by the system (it is implementation-defined, not undefined, behaviour). Different systems may handle it differently: one might report an error (perhaps with errno set to EINVAL), another might write SSIZE_MAX bytes, report that, and leave it to the application to try again with the remainder, and others could be inventive and do something different still.

If you've got a 64-bit system, SSIZE_MAX is likely larger than the amount of disk space in the biggest single data centre in the world (possibly by an order of magnitude or more, even allowing for the NSA and Google), so you're unlikely to run into real problems with this. On 32-bit systems, though, you can easily have more than 2 GiB of space, and if ssize_t is 32-bit, you do have to deal with all this. (On Mac OS X 10.10.3, a 32-bit build has a 4-byte size_t and ssize_t, at least by default.)
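
If you want to see where a particular build lands, a quick, purely illustrative check is to print the limits the headers define:

```c
#include <limits.h>   /* SSIZE_MAX */
#include <stdint.h>   /* SIZE_MAX, intmax_t, uintmax_t */
#include <stdio.h>
#include <unistd.h>   /* ssize_t */

int main(void)
{
    /* Compare the count write() can accept (size_t) with the count it
       can report back (ssize_t) on this particular build. */
    printf("sizeof(size_t)  = %zu, SIZE_MAX  = %ju\n",
           sizeof(size_t), (uintmax_t)SIZE_MAX);
    printf("sizeof(ssize_t) = %zu, SSIZE_MAX = %jd\n",
           sizeof(ssize_t), (intmax_t)SSIZE_MAX);
    return 0;
}
```

On a typical 32-bit build that shows a 4-byte ssize_t, which is the situation described for Mac OS X above.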

Jonathan Leffler
  • *larger than the amount of disk space in the world (by a few orders of magnitude)*. Not at all! It is the other way around: it only takes 20 million 500 GB hard drives to exceed 2^63 bytes. – chqrlie Apr 18 '15 at 22:11
  • @chqrlie `9223372036854775808` bytes in gibibytes = `8589934592`. Divide that amount by 500 (GiB per hard drive): `17,179,869.184` hard drives of 500 GiB each. Though you are right, the world may have more disk space. – yeyo Apr 18 '15 at 22:15
  • @chqrlie: Hmmm..you're close enough that I've reduced the claim. And anyone trying to copy that much data would have problems with 'where is it being read from and written to' and 'are you allowed to read and write that much data all at once' — and the answer would be 'no, you are _not_ allowed to write that much data all at once'. – Jonathan Leffler Apr 18 '15 at 22:16
  • Check this article: it tries to answer this question but it is quite outdated: http://paulwallbank.com/2012/08/23/how-much-server-space-do-internet-companies-need-to-run-their-sites/ . 9 exabytes is not out of reach for google, dropbox or the NSA, but they would be well advised to spread this on several physical data centers. Yet it may still appear as one huge virtual data center. – chqrlie Apr 18 '15 at 22:20
  • 2^63 is 8 EiB, or around 9E18 bytes; that is within reach, but it would be a ginormous data centre, and it is unlikely that people on SO have access to one. Even if they do, the chances of them being allowed to make a single write transferring that much data are negligible, if only because getting enough RAM together to hold the image to be written is a problem too. But 10^50 corresponds to about 2^166, which is a lot bigger, and then you run into problems with atom counts.
  • @chqrlie: That's an interesting link. You might also be aware of XKCD's What If? #63 [Google's Datacentres on Punched Cards](http://what-if.xkcd.com/63/) from September 2013, which comes up with some interesting answers too. – Jonathan Leffler Apr 19 '15 at 01:01
  • With virtual memory, it is not inconceivable to mmap north of 2^63 bytes of core memory that won't get physically mapped immediately. With copy on write, they may even get mapped in core if left blank. Now imagine wiping big brother's entire archive with a single write, as super user, to /dev/onebigvirtualclusterdd... – chqrlie Apr 19 '15 at 07:37
3

Yes, the amount of data that can be written in a single call to write is limited to what can be held in a ssize_t. For clarification, see the relevant glibc documentation page. To quote that page, "Your program should always call write in a loop, iterating until all the data is written." (emphasis added) That page also clarifies that ssize_t is used to represent the size of blocks that can be read or written in a single operation.
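
A loop along those lines might look like the following sketch (the helper name write_all is invented for illustration; it retries after EINTR and resubmits whatever part of the buffer write did not accept):

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling write() until the whole buffer has
   been written or a real error occurs, as the glibc manual suggests. */
static int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;

    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n == -1) {
            if (errno == EINTR)
                continue;        /* interrupted before writing: retry */
            return -1;           /* genuine error; errno is set */
        }
        p += n;                  /* skip past the bytes already written */
        count -= (size_t)n;
    }
    return 0;
}
```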

David Roundy
  • Do you know if it is mentioned somewhere in the POSIX spec that the amount of data written by `write` is bounded by `ssize_t` instead of `size_t`? – Eric Pruitt Apr 18 '15 at 21:50
  • I'm not sure of that; I'm unfamiliar with the POSIX spec. – David Roundy Apr 18 '15 at 21:56
  • Actually, I think the answer is that the behavior is implementation-defined: [the page on write](http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html) says that "If the value of nbyte is greater than {SSIZE_MAX}, the result is implementation-defined." This seems silly to me: why not just define write to do something reasonable, i.e. write only what can be done safely? – David Roundy Apr 18 '15 at 21:58