22

I'm trying to use egrep with a regex pattern to match whitespace.

I've used RegEx with Perl and C# before and they both support the pattern \s to search for whitespace. egrep (or at least the version I'm using) does not seem to support this pattern.

In a few articles online I've come across a shorthand [[:space:]], but this does not seem to work. Any help is appreciated.

Using: SunOS 5.10

thecoshman
user32474

5 Answers

25

I see the same issue on SunOS 5.10. /usr/bin/egrep does not support POSIX character classes such as [[:space:]].

Try using /usr/xpg4/bin/egrep:

$ echo 'this line has whitespace
thislinedoesnthave' | /usr/xpg4/bin/egrep '[[:space:]]'
this line has whitespace

Another option might be to just use perl:

$ echo 'this line has whitespace
thislinedoesnthave' | perl -ne 'chomp;print "$_\n" if /[[:space:]]/'
this line has whitespace
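
If the same script has to run on Solaris and elsewhere, the choice of binary can be made at run time. This is only a sketch: the variable name EGREP is made up for illustration, and it assumes that any system without /usr/xpg4/bin/egrep has a grep that accepts -E.

```shell
# Pick an egrep that understands POSIX character classes.
# Assumption: /usr/xpg4/bin/egrep is present on Solaris; elsewhere
# we fall back to `grep -E`.
if [ -x /usr/xpg4/bin/egrep ]; then
    EGREP=/usr/xpg4/bin/egrep
else
    EGREP='grep -E'
fi

# $EGREP is left unquoted on purpose so that 'grep -E' splits into two words.
result=$(printf 'this line has whitespace\nthislinedoesnthave\n' | $EGREP '[[:space:]]')
echo "$result"
```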
Jon 'links in bio' Ericson
14

If you're using 'degraded' versions of grep (I quote the term because most UNIXes I work on still use the original REs, not the fancier ones with "\s" or "[[:space:]]" :-), you can just fall back to the lowest form of RE.

For example, if you only need [[:space:]] to cover spaces and tabs, just use:

egrep '[ ^I]' file

That ^I is an actual tab character, not the two characters ^ and I.

This assumes you only care about tabs and spaces. In the POSIX locale, [[:space:]] also covers newline, vertical tab, form feed and carriage return, so add those inside the [] if you need them.

The advantage of using degraded REs is that they should work on all platforms (at least for ASCII; Unicode or non-English languages may have different rules but I rarely find a need).
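
Typing a literal tab at a prompt can be awkward, since many shells treat the Tab key as completion. One possible workaround, sketched here with a made-up variable name (tab), is to let printf produce the character and splice it into the bracket expression. Backticks are used so that it should also work in older Bourne shells; plain grep is used because a bracket expression needs no extended syntax, but egrep should behave the same way.

```shell
# Build a literal tab without having to type one.
tab=`printf '\t'`

# Double quotes let the shell substitute $tab; grep then sees "[ <TAB>]".
result=`printf 'has space\nhas\ttab\nneither\n' | grep "[ $tab]"`
echo "$result"
```

The two lines containing a space or a tab are printed; the line "neither" is not.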

Jonathan Leffler
paxdiablo
  • A nice solution. I like the idea of working to the lowest common denominator – thecoshman Jun 13 '12 at 10:18
  • [GNU grep 2.5.4 on CMD](http://gnuwin32.sourceforge.net/packages/grep.htm) is happy with `[ \t]+` to match one or more whitespace chars. – handle Aug 09 '16 at 13:09
3

If you are using bash, then the syntax to put a tab in a string is

$'foo\tbar'

I was recently working with sed to do some fixups on a tab-delimited file. Part of the script was:

sed -E -e $'s/\t--QUOTE--/\t"/g'

That argument is parsed by bash, and sed sees a regex with literal tabs.
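
The same quoting can be applied to the original whitespace question. A minimal sketch, assuming bash (plain sh may not support $'...'); the variable names and sample input are made up for illustration:

```shell
# ANSI-C quoting: bash turns \t into a real tab before grep ever sees the pattern.
pattern=$'[ \t]'

# Count the input lines that contain a space or a tab.
matches=$(printf 'col1\tcol2\nnotabshere\n' | grep -c "$pattern")
echo "$matches"
```

Only the first input line contains a tab, so the count is 1.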

PaulMurrayCbr
0

Maybe you should protect the pattern with quotes (if bash, or anything equivalent for the shell you are using).

[ and ] may have special meaning for the shell.
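
To see why this matters, here is a small sketch of the shell expanding an unquoted bracket expression as a file name pattern. It uses the simpler pattern [ab] so the effect is easy to trigger, and it assumes mktemp -d is available for a scratch directory.

```shell
# Work in a scratch directory so the demo does not touch real files.
# Assumption: mktemp -d is available (it is not required by POSIX).
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch a                      # a file whose name matches the glob [ab]

unquoted=$(echo [ab])        # the shell expands the glob to the file name
quoted=$(echo '[ab]')        # quotes pass the pattern through untouched

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

With the file present, the unquoted form reaches the command as a, while the quoted form arrives as [ab]. An unquoted regex handed to egrep can be rewritten in the same way.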

Giacomo
-3
$ cat > file
this line has whitespace
thislinedoesnthave
$ egrep '[[:space:]]' file
this line has whitespace

Works under debian.

For Solaris, isn't there something like "eselect" (see Gentoo) or an alternatives file for setting your default egrep version?

Have you tried grep -E? If the egrep on your path is not the right one, maybe grep is.

Aif
  • You might get some credit if you explained where 'here' was. It presumably wasn't Solaris 10. Or, if it was Solaris 10, then it probably wasn't /usr/bin/egrep that you used. – Jonathan Leffler Jan 16 '09 at 00:57