I am trying to write what I thought would be a simple regex pattern, but it turned out to be unexpectedly complicated.

I am trying to detect if:

  1. Two alternating words are not used in turns in a sentence:
  • do detect: "Cat cat."
  • do not detect: "Cat dog."
  2. There can be one or more other words between these words:
  • do detect: "The cat chased another cat."
  • do not detect: "The cat chased another dog."
  3. The words can be present more than one time in the sentence:
  • do detect: "The cat chased the dog after the cat had chased another cat."
  • do not detect: "The cat chased the dog after the cat had chased another dog."
  4. The sentence may include punctuation:
  • do detect: "The cat chased the cat, and another cat chased, well – another dog."
  • do not detect: "The cat chased the dog, and another cat chased, well – another dog."
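
The four rules above can also be stated without a single regex, which may help as a reference when testing candidate patterns. The following is only a sketch in Python rather than AutoHotkey; the function name `repeats_without_alternating` and its parameters are hypothetical, and it assumes that only whole words count (so "cats" would not be treated as "cat").

```python
import re

def repeats_without_alternating(sentence, first="cat", second="dog"):
    """Return True if one of the two words occurs twice in a row,
    i.e. without the other word somewhere between the two occurrences."""
    # Collect whole-word occurrences of either word, ignoring case.
    pattern = rf"\b(?:{re.escape(first)}|{re.escape(second)})\b"
    hits = [m.group(0).lower()
            for m in re.finditer(pattern, sentence, re.IGNORECASE)]
    # Other words and punctuation are skipped, so only neighbouring hits matter.
    return any(a == b for a, b in zip(hits, hits[1:]))

print(repeats_without_alternating("The cat chased another cat."))
print(repeats_without_alternating("The cat chased another dog."))
```

The first call reports a repetition and the second does not, matching rule 2.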

So far I have this (in AutoHotkey):

regex := "^(?:(?:(cat\b.*?(?<!cat)\bdog)|(dog\b.*?(?<!dog)\bcat))+|(?:cat|dog)\b.*?(?:cat|dog)\b)$"
string := "The cat chased the cat, and another cat chased, well – another dog."
if (string ~= /regex/i) {
    MsgBox, in turns
} else {
    MsgBox, not in turns
}

But it does not work, and I'm stuck.

Albina
LeFunk

2 Answers


To rephrase the problem: find two occurrences of one word with no occurrence of the other word between them. In other words, check for a specific word order in a sentence.

(cat(?:(?!dog).)*cat)|(dog(?:(?!cat).)*dog)

This regex works like this:

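  • (cat(?:(?!dog).)*cat) finds two occurrences of cat with no dog between them
  • (dog(?:(?!cat).)*dog) finds two occurrences of dog with no cat between them
  • (?:(?!dog).)* and (?:(?!cat).)* consume any run of characters in which the other word never begins: the negative lookahead (?!...) is checked before each character, and (?:...) is a non-capturing group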
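
As a quick check, the pattern can be run against the example sentences from the question. This is a minimal sketch using Python's `re` module rather than AutoHotkey, on the assumption that both engines handle these lookaheads the same way; the `re.IGNORECASE` flag and the names `REPEAT`, `should_detect` and `should_not_detect` are additions for illustration.

```python
import re

# The pattern from this answer, compiled case-insensitively so "Cat" counts too.
REPEAT = re.compile(r"(cat(?:(?!dog).)*cat)|(dog(?:(?!cat).)*dog)", re.IGNORECASE)

should_detect = [
    "Cat cat.",
    "The cat chased another cat.",
    "The cat chased the dog after the cat had chased another cat.",
    "The cat chased the cat, and another cat chased, well – another dog.",
]
should_not_detect = [
    "Cat dog.",
    "The cat chased another dog.",
    "The cat chased the dog after the cat had chased another dog.",
    "The cat chased the dog, and another cat chased, well – another dog.",
]

for sentence in should_detect + should_not_detect:
    # search() finds a repeated word anywhere in the sentence.
    verdict = "repeated" if REPEAT.search(sentence) else "in turns"
    print(f"{verdict:9} | {sentence}")
```

Note that the pattern has no word boundaries, so a word such as "caterpillar" would count as "cat"; wrapping each word in \b is one way to tighten it.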

Inverse pattern (negates the whole expression, so it matches only sentences in which the words are used in turns):

^((?!((cat(?:(?!dog).)*cat)|(dog(?:(?!cat).)*dog))).)*$
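
A short sketch of how the negated form behaves, again in Python's `re` as a stand-in for AutoHotkey (the name `ALTERNATES` and the case-insensitive flag are assumptions for illustration). The outer negative lookahead is tested at every position, so the pattern only reaches `$` when no repeated pair starts anywhere in the sentence.

```python
import re

# The negated pattern from this answer: match the whole line only if
# no "cat ... cat" or "dog ... dog" run (without the other word) starts anywhere.
ALTERNATES = re.compile(
    r"^((?!((cat(?:(?!dog).)*cat)|(dog(?:(?!cat).)*dog))).)*$",
    re.IGNORECASE,
)

print(bool(ALTERNATES.search("The cat chased another dog.")))  # True
print(bool(ALTERNATES.search("The cat chased another cat.")))  # False
print(bool(ALTERNATES.search("Nothing to see here.")))         # True
```

The last line shows a side effect worth knowing about: a sentence containing neither word also matches, because there is nothing in it to reject.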


Albina

This should be a piece of cake with a regex backreference. You could do something like:

/(\b\w+\b).*\b\1\b/

This regex matches if any word repeats in the string. You can try it in an online regex tester.
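
To see what the backreference does and does not cover, here is a small sketch in Python's `re` (the name `ANY_REPEAT` is made up for the example). The group `(\b\w+\b)` captures any whole word and `\1` requires that same text to appear again later, so the check is not limited to "cat" and "dog".

```python
import re

# Capture any whole word, then require the same word again later in the string.
ANY_REPEAT = re.compile(r"(\b\w+\b).*\b\1\b")

print(bool(ANY_REPEAT.search("The cat chased another cat.")))  # True
print(bool(ANY_REPEAT.search("The cat chased another dog.")))  # False
print(bool(ANY_REPEAT.search("the dog saw the cat")))          # True: "the" repeats
```

The third call shows the limitation: the sentence is flagged because "the" repeats, even though "dog" and "cat" are used in turns.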

Daksh
  • You'll need word boundaries around the back-reference too, otherwise it will match "I'm a cat not a caterpillar" – Bohemian Dec 25 '22 at 02:31
  • Thank you! But I'm not checking whether any random word repeats in a string, but whether a specific word (either "cat" or "dog") repeats without the other specific word (if it was "cat" then "dog"; if it was "dog" then "cat") somewhere before it. – LeFunk Dec 25 '22 at 07:06