
The file in this gist has two long lines.

  • When I run tail -n 1 on it, both lines are returned (I would expect just the last one).
  • When I run head -n 1 on it, only the first line is returned (as expected).
  • When I run wc -l on it, it returns 1 (I would expect 2).

If I remove one character from either the first or second line, then some things change:

  • [DIFFERENT] When I run tail -n 1 on it, only the last line is returned (as expected).
  • [SAME] When I run head -n 1 on it, only the first line is returned (as expected).
  • [SAME] When I run wc -l on it, it returns 1 (I would expect 2).

What is going on here? Why are tail and wc not behaving as I would expect on this file?

I'm on macOS 10.14.2, and a colleague was able to reproduce the same behavior on another machine.

nbrustein
  • I'm unable to reproduce, but I imagine this could be due to a lack of proper new line characters in the file for your system. – Marcus Feb 18 '19 at 16:18
  • @Marcus I'm the colleague mentioned in the description. I was also able to reproduce the issue on my dev machine (also macOS). It isn't related to new line character encoding because a slight modification to the character count results in `tail` working properly again. – theblang Feb 18 '19 at 17:21
  • After reading the answer from @Marcus I tested the same file on my Windows box, which has Unix tools installed via Git, and the issue **does not** occur there. The version installed was `GNU coreutils 8.2.6`, which I checked with `tail --version`. – theblang Feb 19 '19 at 14:59

1 Answer


After looking at the file with a hex dump tool, it looks like there is no newline at the end of the file. Interestingly, the GNU coreutils handle this fine, but the BSD utilities included with macOS do not. More information can be found in this Stack Exchange post, which says:

Utilities that are supposed to operate on text files may not cope well with files that don't end with a newline; historical Unix utilities might ignore the text after the last newline, for example. GNU utilities have a policy of behaving decently with non-text files, and so do most other modern utilities, but you may still encounter odd behavior with files that are missing a final newline.
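
The line-count part of this can be seen without the original file. Here is a minimal sketch, assuming a POSIX-style shell with `mktemp -d` available; the file names are made up for illustration. `wc -l` counts newline characters rather than lines of text, so a last line with no terminating newline is not counted:

```shell
# Scratch directory; the file names below are hypothetical.
dir=$(mktemp -d)

# Two lines of text, but no newline after the second one.
printf 'aaa\nbbb' > "$dir/unterminated.txt"

# The same two lines, with the final newline present.
printf 'aaa\nbbb\n' > "$dir/terminated.txt"

# wc -l counts newline bytes; tr strips the padding some implementations add.
unterminated_count=$(wc -l < "$dir/unterminated.txt" | tr -d ' ')
terminated_count=$(wc -l < "$dir/terminated.txt" | tr -d ' ')

echo "unterminated: $unterminated_count"
echo "terminated:   $terminated_count"
```

This should report 1 for the unterminated file and 2 for the terminated one, which matches the `wc -l` result in the question. It does not reproduce the `tail` behavior, which seems to depend on the exact size of the file.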

$ hexdump file-with-2-lines.txt
0000000 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61
*
0001820 61 61 61 61 61 61 61 61 61 61 61 61 0a 62 62 62
0001830 62 62 62 62 62 62 62 62 62 62 62 62 62 62 62 62
*
0003000 62
0003001

After re-saving the file in an editor that enforces a newline at the end of the file (making no other changes), the dump looks like this:

$ hexdump file-with-2-lines.txt
0000000 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61
*
0001820 61 61 61 61 61 61 61 61 61 61 61 61 0a 62 62 62
0001830 62 62 62 62 62 62 62 62 62 62 62 62 62 62 62 62
*
0003000 62 0a
0003002

0a is the hexadecimal code of the newline (line feed) character.
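
The same check can be made without reading a hex dump. This is only a sketch, assuming a POSIX-style shell and a text file; the function names `ends_with_newline` and `ensure_trailing_newline` are invented here and are not standard tools:

```shell
# Succeeds if the last byte of the file is a newline.
# Command substitution strips a trailing newline, so the result is empty
# exactly when the last byte is 0a (or when the file is empty).
ends_with_newline() {
  [ -z "$(tail -c 1 "$1")" ]
}

# Appends a newline only when one is missing, so it is safe to run twice.
ensure_trailing_newline() {
  ends_with_newline "$1" || printf '\n' >> "$1"
}
```

Running `ensure_trailing_newline file-with-2-lines.txt` should have the same effect as re-saving the file in the editor, after which `wc -l` should report 2.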

Marcus