I can't understand why the regexp:

[^\d\s\w,]

Matches the string:

"leonardo,davinci"

Here is my test:

$ echo "leonardo,davinci" | egrep '[^\d\w\s,]'
leonardo,davinci

While this works as expected:

$ echo "leonardo,davinci" | egrep '[\S\W\D]'
$ 

Thanks very much

ndnenkov
Luca
  • @blueygh2 the whole thing is negated. It may have something to do with egrep; I've never used it. – Josep Valls Aug 28 '15 at 19:04
  • @blueygh2 Normally `^` negates everything in the list between brackets. I don't see how that regex could possibly match that string, though. As far as I can see, you are only matching one character that is not a digit, whitespace character, or word character. – Bram Vanroy Aug 28 '15 at 19:04
  • Side note, `[\d\w]` is redundant: `\w = [a-zA-Z0-9_]`. – Sam Aug 28 '15 at 19:07
  • With [grep -P](http://www.gnu.org/software/grep/manual/grep.html#grep-Programs): *Interpret the pattern as a Perl regular expression* – Jonny 5 Aug 29 '15 at 04:53

1 Answer

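It's because egrep doesn't treat \d, \w, and \s as predefined sets inside a bracket expression. There the backslash is just an ordinary character, so [^\d\w\s,] means "any character other than \, d, w, s, or a comma". Every character in the string except the comma and the two d's satisfies that: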

leonardo,davinci

By contrast, with the sets spelled out explicitly,

echo "leonardo,davinci" | egrep '[^a-zA-Z0-9 ,]'

will indeed not match.
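To make the literal interpretation concrete, here is a small sketch. The `probe` helper is made up for this illustration, and the behaviour shown assumes a POSIX-style `grep -E` such as GNU grep:

```shell
# Hypothetical helper (not a standard tool): prints whether grep -E
# finds a match for pattern $1 in string $2.
probe() {
  if printf '%s\n' "$2" | grep -Eq -- "$1"; then
    echo "match"
  else
    echo "no match"
  fi
}

# The string consists only of the five characters listed in the bracket
# expression (d, w, s, comma, backslash), so the negated set finds nothing.
probe '[^\d\w\s,]' 'dws,\'

# Here 'l', 'e', 'o', ... are not in the list, so the negated set matches.
probe '[^\d\w\s,]' 'leonardo,davinci'
```

The first call should report no match and the second a match, which is the behaviour seen in the question.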


If you have it installed, you can use pcregrep instead:

echo "leonardo,davinci" | pcregrep '[^\w\s,]'
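If pcregrep isn't available, POSIX character classes are another option that works inside brackets with plain `grep -E`. This is only a sketch: `has_unexpected_char` is a name invented here, `[:alnum:]_` is used as a rough stand-in for `\w`, and `[:space:]` for `\s`:

```shell
# Hypothetical helper: succeeds if $1 contains a character that is not
# a letter, digit, underscore, whitespace character, or comma.
has_unexpected_char() {
  printf '%s\n' "$1" | grep -Eq '[^[:alnum:]_[:space:],]'
}

if has_unexpected_char 'leonardo,davinci'; then echo "match"; else echo "no match"; fi
if has_unexpected_char 'leonardo;davinci'; then echo "match"; else echo "no match"; fi
```

The first line should print "no match" and the second "match", since only the semicolon falls outside the allowed set.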
Sam
ndnenkov
  • Thank you very much! Is there the same problem with the tab? It doesn't recognize [^\t] as "all except tab", but as "all except t"... – Luca Aug 28 '15 at 21:49
  • @Nopaste, right. It fails for `\t` too: `echo -e 'foo\tbar' | egrep '\t'` yields no matches. – ndnenkov Aug 28 '15 at 22:03
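
For the tab case raised in the comments, one possible workaround is to let the shell produce a real tab character and pass that to grep, rather than relying on `\t` in the pattern. A minimal sketch, where `has_tab` is a made-up helper name:

```shell
# printf interprets \t and emits a real tab; the pattern passed to grep
# then contains the tab character itself rather than the two characters \t.
tab=$(printf '\t')

# Hypothetical helper: succeeds if $1 contains at least one tab.
has_tab() {
  printf '%s\n' "$1" | grep -Eq "$tab"
}

if has_tab "foo${tab}bar"; then echo "tab found"; else echo "no tab"; fi
if has_tab 'foo bar'; then echo "tab found"; else echo "no tab"; fi
```

The same variable can be used inside brackets, for example `grep -E "[^$tab]"` for "anything except a tab".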