I have a file with a list of keywords I would like to locate (pattern.txt):

foo
foo_bar
asdf
asdf_fdsa

Some of these keywords are substrings of others, so I am using the -w option for grep to match full words.

For this example, I am just using a copy of the pattern file for the data to search (data.txt).


When I run `grep -wf pattern.txt data.txt`, I expect all four patterns to be found, but the output contains only the two shorter patterns:

foo
asdf

However, if I re-order the pattern file so that the longer words come before the shorter ones:

foo_bar
foo
asdf_fdsa
asdf

then `grep -wf pattern.txt data.txt` returns all four matches. What gives? Why does the ordering of the pattern file change the output here?

After some research, I can tell that -f is shorthand for writing grep -e ... -e ... and so on, and I can confirm that the same behavior appears when the command is written in that form, but I cannot find any information about this order-dependent behavior. Thanks for any insight.
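Based on the re-ordering observation above, one possible workaround is to sort the pattern file longest-first before passing it to grep. The following is only a sketch: the temporary directory and the file name `pattern_sorted.txt` are assumptions made for illustration, and the sorting trick is grounded solely in the behavior reported in this question, not in any documented guarantee of BSD grep.

```shell
#!/bin/sh
# Sketch of a workaround: feed grep the patterns sorted longest-first.
# The scratch directory and pattern_sorted.txt are hypothetical names.
workdir=$(mktemp -d)

# Recreate the example files from the question.
printf '%s\n' foo foo_bar asdf asdf_fdsa > "$workdir/pattern.txt"
cp "$workdir/pattern.txt" "$workdir/data.txt"

# Prefix each pattern with its length, sort numerically in descending
# order, then strip the length prefix again.
awk '{ print length($0) "\t" $0 }' "$workdir/pattern.txt" \
  | sort -rn \
  | cut -f2- > "$workdir/pattern_sorted.txt"

# Match whole words using the re-ordered pattern file.
result=$(grep -wf "$workdir/pattern_sorted.txt" "$workdir/data.txt")
printf '%s\n' "$result"
```

With a grep that is not affected by the ordering problem, this prints the same four lines as the unsorted pattern file would, so the extra sort should be harmless there.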

Edit: This is on macOS with BSD grep 2.5.1-FreeBSD.

  • I can confirm this on macOS with BSD grep 2.5.1-FreeBSD: `printf '%s\n' 'foo' 'food' | grep -w -e foo -e food` shows only `foo`, while GNU grep shows both `foo` and `food` as expected. The same behavior can be seen with `-x`, and according to my reading of [POSIX grep](https://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html), it should match both. – that other guy May 04 '20 at 20:13
  • Thanks @thatotherguy for confirming. I am guessing it is a bug then if it differs from GNU behavior. – Robyn Murdock May 04 '20 at 20:31
  • OpenBSD 6.6 grep returns all 4 lines as expected, fwiw. (So does NetBSD 9, but that uses GNU grep, not its own implementation) – Shawn May 04 '20 at 23:34

1 Answer

This appears to be resolved on macOS Monterey (12.5.1) and FreeBSD 13.1.

Monterey:

% cat ~/Downloads/food.txt
foo
foo_bar
asdf
asdf_fdsa
% grep -wf ~/Downloads/food.txt ~/Downloads/food.txt
foo
foo_bar
asdf
asdf_fdsa
% grep -w -e foo -e foo_bar -e asdf -e asdf_fdsa ~/Downloads/food.txt
foo
foo_bar
asdf
asdf_fdsa
% printf '%s\n' 'foo' 'food'  | grep -w -e foo -e food
foo
food

FreeBSD:

% cat /tmp/food.txt
foo
foo_bar
asdf
asdf_fdsa
% grep -wf /tmp/food.txt /tmp/food.txt
foo
foo_bar
asdf
asdf_fdsa
% grep -w -e foo -e foo_bar -e asdf -e asdf_fdsa /tmp/food.txt
foo
foo_bar
asdf
asdf_fdsa
% printf '%s\n' 'foo' 'food'  | grep -w -e foo -e food
foo
food

I can only guess that the bug has since been fixed in both FreeBSD and macOS.
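To check whether a particular grep is affected, the `foo`/`food` test from the comments can be wrapped in a small script. This is a rough sketch, not a definitive test: the variable names `count` and `status` are made up for illustration, and it only probes the one case discussed on this page.

```shell
#!/bin/sh
# Probe: does grep -w match both patterns when the shorter one is
# listed first? An unaffected grep should print both input lines.
count=$(printf '%s\n' foo food | grep -w -e foo -e food | wc -l | tr -d ' ')

if [ "$count" -eq 2 ]; then
  status="ok"          # both foo and food matched
else
  status="affected"    # fewer lines than expected were matched
fi

echo "grep -w with short pattern first: $status ($count line(s) matched)"
```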

James Risner