
I thought * meant zero or more of the character or class that precedes it in basic or extended regex. Why does echo hello | grep '*llo' fail but echo hello | egrep '*llo' succeed?

2 Answers


When using grep/egrep/fgrep, you can include the -o flag to make grep return just the characters that matched. (If you have a color terminal, you might also try --color so that it highlights the match in the returned lines.) It often helps in cases like this.

echo "that star " | grep -o '*count'
echo "that star " | egrep -o '*count'
echo "that star " | fgrep -o '*count'
echo "that star counted" | grep -o '*count'
echo "that star counted" | egrep -o '*count'  ## returns "count"
echo "that star counted" | fgrep -o '*count'
echo "that star *counted" | grep -o '*count'  ## returns "*count"
echo "that star *counted" | egrep -o '*count'  ## returns "count"
echo "that star *counted" | fgrep -o '*count'  ## returns "*count"

The ones without comments returned nothing.

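So the difference is that the old grep parser, when it doesn't see a character or set before the star, chooses to treat the star as a normal character to match. fgrep always treats it that way, because it matches fixed strings and has no operators at all. egrep treats the leading star as a no-op and silently continues; POSIX leaves that case undefined for extended regex, so other implementations may behave differently.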
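A minimal sketch of the portable side of this, assuming a POSIX-style grep (checked against GNU grep behavior): the leading star in a basic regex is literal, and escaping the star makes the intent explicit in either syntax. The sample input he*llo is made up for illustration, and grep -E is used as the equivalent of egrep.

```shell
# Basic regex: a star at the very start of the pattern is an ordinary character.
printf '%s\n' 'he*llo' | grep -o '*llo'        # prints: *llo

# Escaping the star makes it literal in both basic and extended syntax.
printf '%s\n' 'he*llo' | grep -o '\*llo'       # prints: *llo
printf '%s\n' 'he*llo' | grep -E -o '\*llo'    # prints: *llo

# With no literal star in the input, the basic-regex pattern matches no lines.
# grep exits with status 1 when nothing matches, hence the "|| true".
printf '%s\n' 'hello' | grep -c '*llo' || true # prints: 0
```

Writing \* whenever a literal star is meant avoids depending on how a particular grep handles an unescaped leading star in extended mode.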

(One more note: I sometimes use pcregrep for Perl regex compatibility, and it actually throws an error message when the regex starts with an asterisk!)

Mark

http://www.regular-expressions.info/repeat.html

http://www.robelle.com/smugbook/regexpr.html

In regular expressions the asterisk repeats the character preceding it, not the one following it.

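In other words, you should say echo hello | grep 'llo*' to find 'll', 'llo', 'lloooo', etc. To repeat more than one character, enclose the characters in parentheses: (llo)* would find llo, llollo, etc. With egrep the parentheses are written as is; with plain grep they must be escaped, as in \(llo\)*.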
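The scope of the star can be checked with -o. This is a small sketch assuming GNU grep; the input strings are invented for illustration.

```shell
# The star repeats only the single item before it: 'llo*' is "ll" then zero or more "o".
printf '%s\n' 'hell'    | grep -o 'llo*'         # prints: ll
printf '%s\n' 'hellooo' | grep -o 'llo*'         # prints: llooo

# Grouping makes the star apply to the whole sequence "llo".
printf '%s\n' 'llollo'  | grep -E -o '(llo)*'    # extended syntax, prints: llollo
printf '%s\n' 'llollo'  | grep -o '\(llo\)*'     # basic syntax, prints: llollo
```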

I'm guessing grep with a leading * fails because it treats the * as a literal character, which "hello" does not contain, while egrep just ignores the *.

Snowburnt
  • Your example would actually find 'll', 'llo', 'lloo', 'llooo' and so forth. The `*` only grabs 1 character unless it is preceded by a group or set. – Mark Apr 08 '13 at 17:31
  • I would think that '*llo' would basically be passing a null value to the * operator, but shouldn't it be ignored if there is no match anyways since it's a '0' or more operator? Furthermore, how do we determine whether the * is an operator for basic regex versus being a glob/wildcard character? The shell is compiled with glob libraries which allow it to expand * to match certain things, but is grep compiled with that library? –  Apr 08 '13 at 17:33
  • @Mark, you're very right. I corrected my answer to reflect that. – Snowburnt Apr 08 '13 at 18:40
  • @GreggLeventhal If it is part of the regular expression it looks like grep ignores it while egrep flips out. If it is part of the file input you can use * as a glob wildcard. – Snowburnt Apr 08 '13 at 18:46