1

Example Strings:

Dandelion The animal dog is blue

The animal cat is blue

Alcohol The animal cow is blue water

I need a regex that captures every substring that starts with the word 'The' and ends with the word 'blue', but does not contain the word 'cat' between these two words.

What I tried:

The.*?(?!cat)blue

Desired Result:

2 Matches:

The animal dog is blue

The animal cow is blue

Any help would be appreciated greatly

Basque0407
  • Are `The` and `blue` also allowed between `The` and `blue`? – Casimir et Hippolyte Apr 19 '18 at 20:22
  • `The(?:(?!cat).)+?blue` – ctwheels Apr 19 '18 at 20:22
  • The word `The` is allowed in between, the capture however should end at the first occurrence of the word `blue` – Basque0407 Apr 19 '18 at 20:25
  • `the word` seems common to all your description. But, I don't think you know what the _`word`_ is in your context. `The cathouse has nice blue` paint, or `They concatenate over at blue` horizon, or `The cat is not blue`. So, which is it ? –  Apr 19 '18 at 21:08
  • Does this answer your question? [Regex - Get string between two words that doesn't contain word](https://stackoverflow.com/questions/7333145/regex-get-string-between-two-words-that-doesnt-contain-word) – But those new buttons though.. Jul 18 '23 at 22:35

2 Answers

1

You can use the character classes \w (word characters) and \W (non-word characters) together with the word boundary \b, which matches between them. To forbid specific words, you only have to test for them at a word boundary using a negative lookahead (?!...) (not followed by ...):

\bThe\W+(?:(?!cat\b|blue\b)\w+\W+)*blue\b

or, with a Perl-compatible regex engine (one that supports possessive quantifiers):

\bThe\W++(?:(?!cat\b|blue\b)\w+\W+)*+blue\b

This way, cat is rejected only as a whole word, and not when it is part of a longer word such as scat or catering.
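As a quick check, here is a minimal sketch that runs the first pattern above against the example strings from the question, using Python's `re` module. The choice of Python and the extra `The scat is blue` test string are assumptions added for illustration; they are not part of the original answer.

```python
import re

# First pattern from the answer (no possessive quantifiers, so it does not
# depend on engine-specific syntax).
PATTERN = re.compile(r"\bThe\W+(?:(?!cat\b|blue\b)\w+\W+)*blue\b")

lines = [
    "Dandelion The animal dog is blue",
    "The animal cat is blue",
    "Alcohol The animal cow is blue water",
    "The scat is blue",  # hypothetical extra case: 'scat' is not the word 'cat'
]

# Collect every match from every line.
matches = [m.group(0) for line in lines for m in PATTERN.finditer(line)]
print(matches)
# ['The animal dog is blue', 'The animal cow is blue', 'The scat is blue']
```

The line containing the whole word `cat` produces no match, while `scat` is accepted because the lookahead is only tested at the start of each word.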

Casimir et Hippolyte
0

".*?" will consume as many characters as it needs to, and "(?!cat)" is only tested at the single position just before "blue". By that point ".*?" may already have matched "cat", so the lookahead never rejects it.
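To see this concretely, the sketch below runs the pattern from the question against the string that should be rejected. It uses Python's `re` module, which is an assumption made for illustration only.

```python
import re

# The pattern attempted in the question.
attempt = re.compile(r"The.*?(?!cat)blue")

m = attempt.search("The animal cat is blue")

# The lookahead is evaluated right before "blue", where the next
# characters are "blue" and never "cat", so it always succeeds there.
print(m.group(0))
# The animal cat is blue
```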

I would put the condition "not followed by anything and then cat" before matching "anything followed by blue", as follows:

The(?!.*cat).*blue
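The following sketch tries this pattern on the three example strings with Python's `re` module (Python is an assumption here, as are the two extra test strings at the end). It gives the desired two matches for the examples, but two behaviours are worth knowing about: the lookahead rejects `cat` anywhere later on the line, even after `blue`, and the greedy `.*` runs to the last `blue` rather than the first.

```python
import re

pattern = re.compile(r"The(?!.*cat).*blue")

lines = [
    "Dandelion The animal dog is blue",
    "The animal cat is blue",
    "Alcohol The animal cow is blue water",
]

found = [m.group(0) for line in lines for m in pattern.finditer(line)]
print(found)
# ['The animal dog is blue', 'The animal cow is blue']

# Hypothetical edge case 1: 'cat' appears only after 'blue', yet the
# lookahead still sees it and the whole match is rejected.
after_blue = pattern.search("The dog is blue, the cat is not")
print(after_blue)
# None

# Hypothetical edge case 2: greedy '.*' extends to the last 'blue'.
two_blues = pattern.search("The dog is blue and the sky is blue")
print(two_blues.group(0))
# The dog is blue and the sky is blue
```

If the match must stop at the first `blue`, or `cat` should only be forbidden between the two words, the word-by-word or character-by-character lookahead approaches shown elsewhere in this thread may be a better fit.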