22

I would like to know the difference between the two commands below. I understand that 2) should be used, but I want to know the exact sequence of events in 1) and 2). Suppose filename has 200 characters in it.

1) cat filename | grep regex

2) grep regex filename

unixadmin007

5 Answers

26

Functionally (in terms of output), those two are the same. The first one creates a separate process, cat, which simply sends the contents of the file to its standard output. That output shows up on the standard input of grep, because the shell has connected the two with a pipe.

In that sense, grep regex <filename is also equivalent, but with one fewer process.
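
A minimal sketch for checking this equivalence on a scratch file. The file contents and the pattern regex are made up for illustration, and the variable names are hypothetical; only standard cat, grep, printf, and mktemp are assumed.

```shell
# Scratch file with two matching lines (contents are illustrative).
sample=$(mktemp)
printf 'alpha\nregex one\nbeta\nregex two\n' > "$sample"

# 1) cat writes the file into a pipe; grep reads its standard input.
out_pipe=$(cat "$sample" | grep regex)

# 2) grep opens the file itself.
out_file=$(grep regex "$sample")

# 3) The shell opens the file and hands it to grep as standard input.
out_redirect=$(grep regex < "$sample")

# All three variables should hold the same two lines.
printf '%s\n' "$out_pipe"
rm -f "$sample"
```

Only form 1) starts an extra process; forms 2) and 3) differ in who opens the file (grep or the shell).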

You start seeing a difference in variants where grep makes use of the extra information (the file names), such as:

grep -n regex filename1 filename2

The difference between that and:

cat filename1 filename2 | grep -n regex

is that the former knows about the individual files whereas the latter sees it as one file (with no name).

While the former may give you:

filename1:7:line with regex in 10-line file
filename2:2:another regex line

the latter will be more like:

7:line with regex in 10-line file
12:another regex line
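
The behaviour can be reproduced with two throwaway files. This is a sketch: the filler lines and the scratch directory are assumptions chosen so that the matches land on line 7 of a 10-line file and line 2 of a second file.

```shell
# Build the two files in a scratch directory (filler content is illustrative).
workdir=$(mktemp -d)
printf 'x\nx\nx\nx\nx\nx\nline with regex in 10-line file\nx\nx\nx\n' > "$workdir/filename1"
printf 'y\nanother regex line\ny\n' > "$workdir/filename2"

# grep opens each file itself, so it can prefix the file name
# and restart the line numbering for every file.
named=$(cd "$workdir" && grep -n regex filename1 filename2)

# grep sees one anonymous stream, so the numbering runs on across both files.
anonymous=$(cd "$workdir" && cat filename1 filename2 | grep -n regex)

printf '%s\n---\n%s\n' "$named" "$anonymous"
rm -rf "$workdir"
```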

Another executable that acts differently when it knows the file names is wc, the word count program:

$ cat qq.in
1
2
3

$ wc -l qq.in           # knows file so prints it
3 qq.in

$ cat qq.in | wc -l     # does not know file
3

$ wc -l <qq.in          # also does not know file
3
paxdiablo
6

First one:

cat filename | grep regex

Normally cat opens the file and prints its contents to stdout. Here, though, its output goes into the pipe '|'. grep then reads from the pipe (the pipe is its stdin) and prints each line that matches regex to stdout. There is one detail: grep is started in a new shell process, so pipe forwards its input as output to new shell process.

Second one:

grep regex filename

Here grep reads directly from the file (above, it was reading from the pipe) and prints each line that matches regex to stdout.
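
The reading side can be inspected with a small sketch. describe_stdin is a hypothetical helper, not a standard command, and it assumes /dev/stdin is available, as it is on Linux and most modern Unix systems.

```shell
# Hypothetical helper: report what kind of object standard input is.
describe_stdin() {
    if [ -p /dev/stdin ]; then
        echo "pipe"
    elif [ -f /dev/stdin ]; then
        echo "regular file"
    else
        echo "other"
        return
    fi
    # Drain the input so an upstream writer is never cut off mid-write.
    cat > /dev/null
}

sample=$(mktemp)
echo "some regex text" > "$sample"

# Behind cat, the reader's standard input is the read end of a pipe.
via_cat=$(cat "$sample" | describe_stdin)
# With a redirection, the reader's standard input is the file itself.
via_redirect=$(describe_stdin < "$sample")

echo "cat file | cmd : stdin is a $via_cat"
echo "cmd < file     : stdin is a $via_redirect"
rm -f "$sample"
```

A command given the file name as an argument, as in grep regex filename, opens the file on a separate descriptor and leaves its standard input untouched.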

denizeren
  • 1
    +1: A pedant (e.g. me) might argue that `cat` always writes to its standard output, but in the context of the pipe, its standard output is the write end of a pipe. Similarly, when `grep` is invoked with no file name arguments, or when it processes a filename argument of `-`, it will read its standard input, which in this case, is the read end of the pipe. Note that `pipe` or `|` is not a command; it isn't quite clear whether you recognize that with 'so pipe forwards its input as output to new shell process'. – Jonathan Leffler Nov 22 '12 at 07:59
4

If you want to check the actual execution time difference, first create a file with 100000 lines:

user@server ~ $ for i in $(seq 1 100000); do echo line${i} >> test_f; done
user@server ~ $ wc -l test_f
100000 test_f

Now measure:

user@server ~ $ time grep line test_f
#...
real    0m1.320s
user    0m0.101s
sys     0m0.122s

user@server ~ $ time cat test_f | grep line
#... 
real    0m1.288s
user    0m0.132s
sys     0m0.108s

As we can see, the difference is not too big...
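
Writing 100000 matching lines to a terminal can easily dominate such a measurement, so a fairer comparison discards the output. The following is a sketch rather than a rigorous benchmark. It assumes bash, where time is a keyword that covers the whole pipeline, and it uses a scratch file from mktemp instead of test_f.

```shell
# Build a 100000-line test file in one pass (scratch path from mktemp).
test_file=$(mktemp)
seq 1 100000 | sed 's/^/line/' > "$test_file"

# Discard the matches so terminal output does not dominate the timing.
# In bash, `time` is a keyword and times the whole pipeline, not just cat.
time grep line "$test_file" > /dev/null
time cat "$test_file" | grep line > /dev/null

# Sanity check: both forms should match the same number of lines.
direct_count=$(grep -c line "$test_file")
piped_count=$(cat "$test_file" | grep -c line)
echo "direct: $direct_count, piped: $piped_count"
rm -f "$test_file"
```

Running each command several times and comparing the spread gives a better picture than a single run, since the file is likely to be served from the page cache after the first read.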

dstronczak
  • 1
    Does the second `time` command time the `cat` or the whole pipeline? – Jonathan Leffler Nov 22 '12 at 08:01
  • 4
    How much of the time you observed was due to the omitted output being written to screen? I tried with the output of `grep` redirected to `/dev/null` and got times in the 10-50 ms range, not the 1 second range. Now, my machine is no slouch, but 20 times as fast as yours seems unlikely (even allowing that the file is probably mostly in memory, not on disk). It is very hard to do good benchmarking. What I fear you are measuring is the time taken to write 100,000 lines to your terminal, rather than the raw performance of `grep` vs `cat | grep`. – Jonathan Leffler Nov 22 '12 at 08:09
  • 2
    For anyone who may be curious about the results with the above feedback, I just reran these using the same file as above, but with these timing commands: `time grep line test_f > /dev/null` and `time (cat test_f | grep line > /dev/null)`. The times are more in line with @JonathanLeffler 's results, but both commands still are almost the exact same speed. – bkribbs Mar 29 '17 at 06:24
1

Although the outputs are the same, the two commands work differently.

$ cat filename | grep regex

This command first reads out the content of the file "filename", then searches that content for regex, while

$ grep regex filename

searches directly for content matching regex in the file "filename".

Nji
0

Functionally they are equivalent; however, the shell will fork two processes for cat filename | grep regex and connect them with a pipe.
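
The two processes can be made visible with a small sketch in which each side of the pipe reports its own process ID. The two sh -c commands are stand-ins for cat and grep, and the variable names are hypothetical.

```shell
# The writer prints its PID into the pipe; the reader reads that line
# and appends its own PID ($$ is evaluated inside each sh -c).
pipeline_pids=$(sh -c 'echo "$$"' | sh -c 'read -r left; echo "$left $$"')

# Split "writer reader" into its two fields.
left_pid=${pipeline_pids% *}
right_pid=${pipeline_pids#* }

echo "writer pid: $left_pid, reader pid: $right_pid"
```

The two numbers differ because each side of the pipe runs as its own process.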

iabdalkader