4

I need help with a RegEx problem:

I want to find occurrences of two known words ("foo" and "bar", for example) that have any whitespace other than EXACTLY ONE SPACE CHARACTER between them.

In the text that I have to grep, there may be spaces, tabs, CRs, LFs or any combination of them between the two words.

In RegEx words: I need one regular expression that matches "foo[ \t\n\r]+bar" but does NOT match "foo bar".

Everything I've tried so far either missed some combinations or also matched the single-space case, which is the only one that should NOT match.

Thanks in advance for any solutions.

EDIT: To clarify, I'm using Perl compatible RegEx here.

kennytm
selfawaresoup

3 Answers

4

You could also use a negative lookahead:

foo(?! \b)\s+bar

If lookaheads are not supported you can write it explicitly:

foo(?:[^\S ]| \s)\s*bar

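The expression [^\S ] includes a double negative, so it might not be immediately obvious how it works. \S means "any non-whitespace character", so [^\S ] means "anything that is neither non-whitespace nor a space", which works out to any whitespace character apart from a space.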
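To check both patterns against a few inputs, here is a minimal sketch using Python's `re` module. Python is my assumption here (the question is about PCRE), but both patterns only use syntax that Python shares. The sample strings and the `matches` helper are made up for illustration.

```python
import re

# The two patterns from this answer, unchanged.
LOOKAHEAD = re.compile(r"foo(?! \b)\s+bar")
EXPLICIT = re.compile(r"foo(?:[^\S ]| \s)\s*bar")

# Hypothetical sample inputs: only the first one (a single space) should be rejected.
samples = ["foo bar", "foo  bar", "foo\tbar", "foo\r\nbar", "foo \n bar", "foo\t bar"]

def matches(pattern, text):
    """Return True if the pattern is found anywhere in text."""
    return pattern.search(text) is not None

# Map each sample to (lookahead result, explicit result).
results = {s: (matches(LOOKAHEAD, s), matches(EXPLICIT, s)) for s in samples}
for text, (a, b) in results.items():
    print(repr(text), a, b)
```

Both patterns should agree on every sample: the single-space string is rejected and everything else matches.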

Mark Byers
1

You could use (assuming ERE, i.e. grep -E)

foo[[:space:]]{2,}bar

The syntax x{min,} means the pattern x must appear at least min times.


If by "other than EXACTLY ONE SPACE CHARACTER" you mean any whitespace except a lone 0x20 space character (so that a single tab or newline should also match), you need an alternation:

foo([\t\n\r]|[ \t\n\r]{2,})bar
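
The difference between the two readings shows up on a single tab. A minimal sketch in Python (my choice of language, not part of the question): Python's `re` has no POSIX classes, so `\s` stands in for `[[:space:]]` in the first pattern, and the sample strings are made up.

```python
import re

# Stand-in for the ERE pattern: \s replaces [[:space:]], which Python's re lacks.
TWO_OR_MORE = re.compile(r"foo\s{2,}bar")
# The alternation from this answer, unchanged.
ALTERNATION = re.compile(r"foo([\t\n\r]|[ \t\n\r]{2,})bar")

# Hypothetical sample inputs.
samples = ["foo bar", "foo\tbar", "foo  bar", "foo \r\nbar"]

# Map each sample to (two-or-more result, alternation result).
table = {s: (bool(TWO_OR_MORE.search(s)), bool(ALTERNATION.search(s))) for s in samples}
for text, row in table.items():
    print(repr(text), row)
```

Only the alternation accepts "foo\tbar"; both reject the single space and accept runs of two or more whitespace characters.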
kennytm
0

Use [[:space:]]{2,}

{2,} means two or more repetitions.

explodus