
I got the following regular expression from "Regular expression to count number of commas in a string":

/^([^,]*,){21}[^,]*$/ 

It is the top-rated solution (https://stackoverflow.com/a/863137/3787418) for matching a string that contains exactly 21 commas.

How can I modify that regular expression to match 21 occurrences of 'hello world' instead of 21 occurrences of a single character?

knowname

1 Answer


Regex really isn't the right tool for that, but here you go:

^(?:(?:[^h]|h(?!ello world))*hello world){21}(?:[^h]|h(?!ello world))*$

This will only work in regex flavors that support negative lookahead.

It works the same way as the regex you found: inside a group repeated 21 times, we match "anything that isn't 'hello world'" followed by one occurrence of "hello world". The trailing part of the pattern then matches the rest of the string, which must not contain a 22nd occurrence. The difficulty is in matching "anything that isn't 'hello world'", which I have defined as follows:

  • any character that isn't h ([^h])
  • or h if it isn't followed by ello world (h(?!ello world))
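
To see the two alternatives in action, here is a minimal sketch in Python (chosen only because its `re` module supports negative lookahead; the question does not name a language). The names `PATTERN` and `has_exactly_21` are made up for illustration.

```python
import re

# The pattern from this answer, unchanged. Each of the 21 groups consumes
# "anything that isn't 'hello world'" and then one literal "hello world".
PATTERN = re.compile(
    r"^(?:(?:[^h]|h(?!ello world))*hello world){21}(?:[^h]|h(?!ello world))*$"
)


def has_exactly_21(text: str) -> bool:
    """Return True if text contains exactly 21 occurrences of 'hello world'."""
    return PATTERN.match(text) is not None


if __name__ == "__main__":
    print(has_exactly_21("hello world, " * 21))
    print(has_exactly_21("hello world, " * 22))
    # An "h" that does not start "hello world" is consumed by h(?!ello world).
    print(has_exactly_21("hello there " + "hello world " * 21))
```

Note that in Python `[^h]` also matches newlines, so the check applies to the whole string rather than line by line.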

Of course any sane person would choose to use a plain text search on the string instead.
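
As a rough illustration of the plain text search, assuming Python is available, the built-in `str.count` counts non-overlapping occurrences of a substring, which is enough here because "hello world" cannot overlap itself. The function name and its defaults are hypothetical.

```python
def has_exact_count(text: str, phrase: str = "hello world", expected: int = 21) -> bool:
    """Return True if phrase occurs exactly `expected` times in text."""
    # str.count scans the string once and needs no regex engine.
    return text.count(phrase) == expected


if __name__ == "__main__":
    print(has_exact_count("hello world " * 21))
    print(has_exact_count("hello world " * 20))
```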

Aaron
  • Since you are advising against using regular expressions for that use case, how would you remove all lines from a file that contain 'hello world' 21 times? – knowname Feb 06 '17 at 12:44
  • @knowname under which kind of environment? node.js? – Aaron Feb 06 '17 at 13:16
  • For example, in a `bash` environment with `GNU Tools`: `while IFS= read -r line; do [ "$(printf '%s\n' "$line" | grep -Eo '\bhello world\b' | wc -l)" -eq 21 ] || printf '%s\n' "$line"; done < source > target` – Aaron Feb 06 '17 at 13:46
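
For comparison, the same line filter could be sketched in Python. This is only an approximation of the shell loop above: it uses `\b` word boundaries like the `grep` call, and the names `WORD` and `drop_matching_lines` are invented for the example.

```python
import re
from typing import Iterable, Iterator

# Same whole-word match as grep -Eo '\bhello world\b'.
WORD = re.compile(r"\bhello world\b")


def drop_matching_lines(lines: Iterable[str], expected: int = 21) -> Iterator[str]:
    """Yield every line except those with exactly `expected` matches."""
    for line in lines:
        if len(WORD.findall(line)) != expected:
            yield line


if __name__ == "__main__":
    sample = ["hello world " * 21, "hello world " * 20, "nothing here"]
    for kept in drop_matching_lines(sample):
        print(kept)
```

To process files as in the shell version, one could pass an open file object for `source` as `lines` and write each yielded line to `target`.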