I'm currently working through Kernighan and Pike's excellent book The UNIX Programming Environment. One interesting example they give (Exercise 3-7) is the command

cat x y > y

I tried running this command with x initially containing xxx and y initially containing yyy.

The command does not complete. x remains unchanged (as you would expect) and y ends up with a large number of lines of xxx.

This is how I've rationalised the behaviour:

  1. The first thing the > redirection does is truncate y, ready to receive the redirected data. Hence no yyy bytes end up in y.

  2. As y is empty when cat starts writing data, the output of cat x y is (initially) just xxx.

  3. cat will not stop writing until it reaches EOF on y. But it never reaches EOF, presumably because each completed write pushes EOF further past the current read position. So cat keeps appending y to itself indefinitely.
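
The three steps above can be reproduced in a bounded form. This is only a minimal Python sketch of the same sequence, not how cat itself is written: the function name `simulate_cat_x_y_into_y` and the `max_writes` cap are made up for illustration, and the cap is there purely so the loop terminates.

```python
import os
import tempfile


def simulate_cat_x_y_into_y(max_writes=5):
    """Mimic `cat x y > y`, stopping after max_writes writes (hypothetical cap)."""
    with tempfile.TemporaryDirectory() as d:
        x_path = os.path.join(d, "x")
        y_path = os.path.join(d, "y")
        with open(x_path, "wb") as f:
            f.write(b"xxx\n")
        with open(y_path, "wb") as f:
            f.write(b"yyy\n")

        # Step 1: the redirection truncates y before anything is read.
        out_fd = os.open(y_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
        writes = 0
        try:
            # Steps 2 and 3: copy x, then y, to the output descriptor.
            for path in (x_path, y_path):
                in_fd = os.open(path, os.O_RDONLY)
                try:
                    while writes < max_writes:
                        chunk = os.read(in_fd, 4096)
                        if not chunk:  # EOF on this input
                            break
                        os.write(out_fd, chunk)
                        writes += 1
                finally:
                    os.close(in_fd)
        finally:
            os.close(out_fd)

        with open(y_path, "rb") as f:
            return f.read()


result = simulate_cat_x_y_into_y(max_writes=5)
print(result)
```

With the cap removed, the read on y would keep finding the bytes that the previous write just added, which matches what I observed.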

If anyone could provide a more articulate (and less handwavy) explanation of this behaviour, or correct me if this is total rubbish, that would be much appreciated.

Also I'm sure at one point I found a set of worked solutions online for the problems in this textbook, but I can't find them now. If anyone could point me to such a thing, that would be great.

Many thanks,

MB

Monkeybrain
  • you overwrote the contents of file y. – t0mm13b Oct 23 '15 at 20:47
  • which version of `cat` are you using? GNU cat 8.21 dumps x's contents into `y`, then aborts with `cat: y: input file is output file` – Marc B Oct 23 '15 at 20:49
  • Using cat 8.23. As you stated in your 2nd point, y is empty; it is overwritten with the contents of x, and `cat x y > y` then yields the message `cat: y: input file is output file` – t0mm13b Oct 23 '15 at 20:51
  • @Marc B - I'm using OSX cat. @Charles Duffy - I will try learning about `strace` - sounds useful :-) ... *looks into it* ... uh oh, `strace` isn't available for OSX ... I'll have to research an equivalent. Thanks for the suggestion. – Monkeybrain Oct 23 '15 at 20:58
  • See if you have `truss` or `dtruss`. – Charles Duffy Oct 23 '15 at 21:12

1 Answer

Yes, your existing understanding is correct.

To look at the actual syscalls (from strace busybox cat x y >y, to avoid the GNU version's attempts at detecting and terminating such loops):

open("x", O_RDONLY)                     = 3
read(3, "xxx\n", 4096)                  = 4
write(1, "xxx\n", 4)                    = 4
read(3, "", 4096)                       = 0
close(3)                                = 0
open("y", O_RDONLY)                     = 3
read(3, "xxx\n", 4096)                  = 4
write(1, "xxx\n", 4)                    = 4
read(3, "xxx\n", 4096)                  = 4
write(1, "xxx\n", 4)                    = 4

...and the final two lines repeat ad infinitum.

Not shown is the outer shell's open("y", O_WRONLY|O_CREAT|O_TRUNC, 0666), done after the fork() to spawn the subprocess which then exec's cat but before cat's actual execution -- and thus, before all of the above.
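
To illustrate the effect of that open() on its own, here is a small Python sketch using the same flags. It only shows that opening with O_TRUNC empties the file immediately, before any reader touches it; the temporary directory and variable names are assumptions for the demo, not anything the shell does.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    y_path = os.path.join(d, "y")
    with open(y_path, "wb") as f:
        f.write(b"yyy\n")

    size_before = os.path.getsize(y_path)

    # Same flags and mode as the shell's open() for `> y`.
    fd = os.open(y_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
    size_after = os.fstat(fd).st_size  # measured before a single byte is written
    os.close(fd)

print(size_before, size_after)
```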

Thus, we see exactly what you posited: Only a single line of xxxs is available for read until cat performs a write, at which point additional content is immediately available for read, so the loop proceeds.
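
As for the `input file is output file` error mentioned in the comments: one plausible way to detect the situation is to compare the device and inode numbers of an input descriptor with those of the output descriptor. The sketch below is an assumption about how such a check can be done in general, not GNU cat's actual code, and `same_file` is a hypothetical helper name.

```python
import os
import tempfile


def same_file(fd_a, fd_b):
    """Return True if both descriptors refer to the same underlying file."""
    a, b = os.fstat(fd_a), os.fstat(fd_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


with tempfile.TemporaryDirectory() as d:
    x_path = os.path.join(d, "x")
    y_path = os.path.join(d, "y")
    for path, data in ((x_path, b"xxx\n"), (y_path, b"yyy\n")):
        with open(path, "wb") as f:
            f.write(data)

    # Stand-in for stdout after the shell has applied `> y`.
    out_fd = os.open(y_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
    x_fd = os.open(x_path, os.O_RDONLY)
    y_fd = os.open(y_path, os.O_RDONLY)

    x_is_output = same_file(x_fd, out_fd)
    y_is_output = same_file(y_fd, out_fd)

    for fd in (x_fd, y_fd, out_fd):
        os.close(fd)

print(x_is_output, y_is_output)
```

A tool doing a check along these lines could copy x normally and refuse to read y, which would be consistent with the behaviour Marc B describes.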

Charles Duffy