
I would like to detect whether a string contains more than a given number of distinct words. A word can consist of any characters except spaces.

For example, I want to check whether each of the following strings has no more than three distinct words:

lorum                               -> True
lorum ipsum                         -> True
lorum ipsum dolor                   -> True
lorem lorem ipsum dolor ipsum ipsum -> True
lorem lorem <=>                     -> True
1 2 3                               -> True

lorem ipsum dolor sit lorum         -> False
lorem ipsum dolor sit               -> False
1 2 3 4                             -> False
AVS

1 Answer


To my great surprise, this is actually achievable with a regular expression. It is really ugly and inefficient, but it works.

You should probably not use it, though: a regex is not the right tool for this job.

/^(\S*)(?: \1)*(?:(?: (\S*))(?: \1| \2)*(?: (\S*))?)?(?: \1| \2| \3)*$/gm
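To see how the pattern behaves outside regex101, here is a minimal sketch that runs it through Python's `re` module. Python is an assumption, since the question names no language, and the helper name `regex_at_most_three` is made up for illustration. The `g` and `m` flags are dropped because each string is tested on its own.

```python
import re

# Same pattern as above, without the /.../gm wrapper.
# Groups 1-3 capture the first, second and third distinct word;
# every other word must be a backreference to one of them.
PATTERN = re.compile(
    r"^(\S*)(?: \1)*(?:(?: (\S*))(?: \1| \2)*(?: (\S*))?)?(?: \1| \2| \3)*$"
)


def regex_at_most_three(line: str) -> bool:
    """Return True if the pattern accepts the whole line."""
    return PATTERN.match(line) is not None


print(regex_at_most_three("lorem lorem ipsum dolor ipsum ipsum"))  # True
print(regex_at_most_three("1 2 3 4"))                              # False
```

Note that the pattern expects words to be separated by exactly one space.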

https://regex101.com/r/0cgoFF/1
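If a regex is not a hard requirement, the more direct route is to split the string and count the distinct words. A minimal sketch, again assuming Python; the function name `has_at_most_n_distinct_words` and the `limit` parameter are hypothetical:

```python
def has_at_most_n_distinct_words(text: str, limit: int = 3) -> bool:
    """Return True if text contains no more than `limit` distinct words."""
    # str.split() with no arguments splits on runs of whitespace,
    # so repeated spaces do not produce empty words.
    distinct_words = set(text.split())
    return len(distinct_words) <= limit


print(has_at_most_n_distinct_words("lorem lorem <=>"))              # True
print(has_at_most_n_distinct_words("lorem ipsum dolor sit lorum"))  # False
```

This also makes the limit a parameter, whereas the regex would need another capture group for every additional allowed word.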

Nicolas Reynis