C implement like cat with robustness and efficiency

Question

I want to learn how to implement a program like cat, which just takes input from a file and prints it to stdout.

But I am not sure that the write() line is robust in all cases, as it may write fewer than n bytes. I have not been able to create a test case that makes this happen. How can I make a test case that results in fewer than n bytes being written? Also, how should I modify the code to make the program robust (for this case, but also for other cases that I have not described)?

#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>

int main(int argc, char *argv[]) {
    const char *pathname = argv[1];
    int fd;
    if((fd = open(pathname, O_RDONLY)) == -1) {
        perror("open");
        return 1;
    }
#define BUF_SIZE 1024
    char buf[BUF_SIZE];
    ssize_t n;
    while((n = read(fd, &buf, BUF_SIZE)) > 0) {
        if(write(STDOUT_FILENO, &buf, n) == -1) {
            perror("write");
            return 1;
        }
    }
    if(n == -1) {
        perror("read");
        return 1;
    }
    if(close(fd) == -1) {
        perror("close");
        return 1;
    }
    return 0;
}

EDIT: I fixed the write() bug in the previous code based on the pipe-blocking test case mentioned by Armali. Can anybody check whether there are any other bugs?

#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>

int main(int argc, char *argv[]) {
    const char *pathname = argv[1];
    int fd;
    if((fd = open(pathname, O_RDONLY)) == -1) {
        perror("open");
        return 1;
    }
#define BUF_SIZE 2*65536
    char buf[BUF_SIZE];
    ssize_t r_n;
    while((r_n = read(fd, &buf, BUF_SIZE)) > 0) {
        ssize_t w_n;
        int i = 0;
        while((w_n = write(STDOUT_FILENO, buf+i, r_n)) < r_n) {
            if(w_n == -1) {
                perror("write");
                return 1;
            }
            r_n -= w_n;
            i += w_n;
        }
    }
    if(r_n == -1) {
        perror("read");
        return 1;
    }
    if(close(fd) == -1) {
        perror("close");
        return 1;
    }
    return 0;
}

I don't see any issues with your handling of the read or write. (you actually do a very good job validating each step) Why do you want to use syscalls instead of `stdio.h` I/O? — David C. Rankin, Mar 26 '21 at 04:56
`write` has [documented edge cases](https://man7.org/linux/man-pages/man2/write.2.html) where less than `n` bytes can be written. Indeed your code will fail in those cases. The only way I can imagine to test reliably is to mock write (maybe via macro). — Gene, Mar 26 '21 at 05:00
The edge cases you are attempting to emulate will not be common for a simple `cat` program for files less than 2^31 bytes of data. The partial write condition is generally associated with network writes. There you would just loop until `n` bytes had been written keeping track of the number written with each call. The other case is disk-full errors (which won't occur writing to `stdout` unless the entire filesystem is filled by something else, and then it will depend on the implementation). Your use of `read` is already limited to `BUF_SIZE` (`1024` bytes) — David C. Rankin, Mar 26 '21 at 05:25
Aside: Drop the `&` in 2 places: `while((n = read(fd, &buf, BUF_SIZE)) > 0) { if(write(STDOUT_FILENO, &buf, n) == -1) {`. `buf` is sufficient. — chux - Reinstate Monica, Mar 26 '21 at 06:05
`if(write(STDOUT_FILENO, &buf, n) == -1) {` should be `if(write(STDOUT_FILENO, &buf, n) != n) {`. — chux - Reinstate Monica, Mar 26 '21 at 06:07

score 1 · Answer 1 · answered Mar 26 '21 at 13:38

How can I make a test case that results in fewer than n bytes being written?

Depending on the system, this need not be difficult. man 2 write says:

… partial writes can occur for various reasons; for example, because there was
insufficient space on the disk device to write all of the requested bytes, or because
a blocked write() to a socket, pipe, or similar was interrupted by a signal …

Let's focus on pipe writes, which block if the pipe buffer is full. This example targets Linux; the numbers may well be different on other systems. man 7 pipe says this about the pipe capacity:

Since Linux 2.6.11, the pipe capacity is 16 pages (i.e., 65,536 bytes in a system with
a page size of 4096 bytes).

To make use of this for the sake of a test case, we have to increase your BUF_SIZE to a greater value, say 1024*128. Then we can conduct the experiment:

~$ dd count=256 </dev/zero >zeros
256+0 records in
256+0 records out
131072 bytes (131 kB, 128 KiB) copied, 0.00357324 s, 36.7 MB/s
~$ strace -ewrite ./a.out zeros|(sleep 9; dd)
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072^Z
[1]+  Stopped                 strace -ewrite ./a.out zeros | ( sleep 9; dd )
~$ fg
strace -ewrite ./a.out zeros | ( sleep 9; dd )
) = 65536
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=482, si_uid=2001} ---
+++ exited with 0 +++
128+0 records in
128+0 records out
65536 bytes (66 kB, 64 KiB) copied, 0.00546081 s, 12.0 MB/s

First we prepare a file of 128 KiB.
Then we run the program with strace -ewrite to see the write operation; we pipe its output to (sleep 9; dd) to delay the reading. During this delay, while write() is blocked on the full pipe, we send a stop signal (SIGTSTP) by pressing Ctrl-Z.
Finally, we continue the program with fg. Now we see that the write operation, which has been called with count 131072, returns only 65536 bytes written.
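
A short write can also be provoked from inside a program, without strace or job control. The following is only a sketch of a different technique than the experiment above: instead of interrupting a blocked write with a signal, it makes the write end of a pipe non-blocking and asks for more bytes than the pipe can hold. The function name short_write_demo and the 4 MiB buffer size are made up for this illustration, and the exact number returned depends on the system's pipe capacity.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical test helper (not from the question): request up to
 * `request` bytes on a pipe that nobody reads from, and return what
 * write() reports. Returns -1 if the setup or the write itself fails. */
ssize_t short_write_demo(size_t request)
{
    static char zeros[4 * 1024 * 1024];  /* zero-initialized source data */
    int fds[2];

    if (request > sizeof zeros)
        request = sizeof zeros;
    if (pipe(fds) == -1)
        return -1;

    /* With O_NONBLOCK, write() returns instead of waiting for a reader. */
    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* The pipe is empty, so some bytes fit, but not all of them. */
    ssize_t w = write(fds[1], zeros, request);

    close(fds[0]);
    close(fds[1]);
    return w;
}
```

Calling short_write_demo(4u * 1024 * 1024) should return a positive value smaller than the request as long as the pipe capacity is below 4 MiB, which is the assumption this sketch relies on.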

Also, how should I modify the code to make the program robust (for this case, but also for other cases that I have not described)?

For this case and similar cases, you'd do as David C. Rankin wrote:

There you would just loop until n bytes had been written keeping track of the number written with each call.
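
A minimal sketch of such a loop, assuming a helper named write_all (a name invented here, not taken from the question). Retrying on EINTR is an extra assumption of this sketch; it covers the case where a signal interrupts write() before any byte was transferred.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `count` bytes
 * are written or a real error occurs.
 * Returns 0 on success, -1 on error with errno set by write(). */
int write_all(int fd, const void *data, size_t count)
{
    const char *p = data;

    while (count > 0) {
        ssize_t w = write(fd, p, count);
        if (w == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before writing anything: retry */
            return -1;          /* genuine error, errno says which */
        }
        p += w;                 /* skip the bytes that were written */
        count -= (size_t)w;     /* a return of 0 just leads to another try */
    }
    return 0;
}
```

In the read loop, the call would then look like `if (write_all(STDOUT_FILENO, buf, (size_t)n) == -1) { perror("write"); return 1; }`.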

I can't say how to make a program robust against undescribed cases, other than by checking the return value of every library function you call that is not guaranteed to succeed.

If `write()` returns 0, can I be sure that `write()` has finished writing all the input instead of a failure? — , Mar 26 '21 at 13:52
No, if _`write(fd, buf, count)` has finished writing all the input_, it returns the passed _count_, hence it could return zero only if you passed a zero _count_, which is not advisable, since the rules for this case are somewhat complicated and partially unspecified. — Armali, Mar 26 '21 at 14:36
So if the count argument of `write()` is nonzero, but `write()` returns 0, then I can be sure the write is successfully finished? — , Mar 26 '21 at 15:07
I'm not sure what you mean by _finished_. If 0 is returned, this invocation of `write` is finished without error, but of course the writing of the requested _count_ bytes is not at all finished and has to be retried. Your fixed program does this right. — Armali, Mar 26 '21 at 15:16
