Regular expression to count number of pattern in a string

Question

I got the following regular expression from "Regular expression to count number of commas in a string":

/^([^,]*,){21}[^,]*$/

It is the top-rated solution (https://stackoverflow.com/a/863137/3787418) for matching 21 commas.

How can I modify that regular expression to match 21 occurrences of 'hello world' instead of 21 occurrences of a single character?

I recommend a normal search rather than regex for the job. – Jithin Pavithran Feb 02 '17 at 16:17

Aaron · Accepted Answer · 2020-12-30T18:56:22.013

Regex really isn't the tool for that, but here you go:

^(?:(?:[^h]|h(?!ello world))*hello world){21}(?:[^h]|h(?!ello world))*$

This will only work in regex flavors which support negative lookahead.

It works the same way as the regex you've found: in a group repeated 21 times, we match "what isn't 'hello world'", followed by one occurrence of "hello world". The difficulty is in matching "what isn't 'hello world'", which I have defined as follows:

any character that isn't h ([^h])
or h if it isn't followed by ello world (h(?!ello world))
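
As a minimal sketch of how the two alternatives above combine, the pattern can be assembled and tried out in Python, whose `re` module supports negative lookahead. The names `FILLER` and `has_exactly` are made up for this illustration, and `re.fullmatch` stands in for the `^` and `$` anchors:

```python
import re

# Filler: a run of characters containing no "hello world".
# Each character is either not an "h", or an "h" that is not
# followed by "ello world".
FILLER = r"(?:[^h]|h(?!ello world))*"


def has_exactly(text: str, n: int = 21) -> bool:
    """Return True if text contains exactly n occurrences of 'hello world'."""
    # n repetitions of "filler, then hello world", then trailing filler.
    # fullmatch requires the whole string to match, like ^ ... $.
    pattern = rf"(?:{FILLER}hello world){{{n}}}{FILLER}"
    return re.fullmatch(pattern, text) is not None
```

Because the filler can never swallow a full "hello world", a string with 20 or 22 occurrences cannot satisfy the `{21}` repetition and the match fails.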

Of course any sane person would choose to use a plain text search on the string instead.
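
For comparison, here is a rough sketch of the plain text search in Python, assuming non-overlapping occurrences are what is wanted. The function name `contains_exactly` is hypothetical; `str.count` is the standard substring counter:

```python
def contains_exactly(text: str, n: int = 21, needle: str = "hello world") -> bool:
    """Return True if needle occurs exactly n times in text (non-overlapping)."""
    # str.count scans for the literal substring; no regex involved.
    return text.count(needle) == n
```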

edited Dec 30 '20 at 18:56

answered Feb 02 '17 at 16:06

Aaron


Since you are advising against using regular expressions for that use case, how would you remove all lines from a file that contain 'hello world' 21 times? – knowname Feb 06 '17 at 12:44
@knowname under which kind of environment? node.js? – Aaron Feb 06 '17 at 13:16
For example, under a `bash` environment with `GNU Tools`: `while IFS= read -r line; do [ "$(printf '%s\n' "$line" | grep -Eo '\bhello world\b' | wc -l)" -eq 21 ] || printf '%s\n' "$line"; done < source > target` – Aaron Feb 06 '17 at 13:46
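
The same line filter could be sketched in Python along these lines. The helper name `drop_lines_with_count` is invented for this example, and it assumes Python's `\b` word boundary is close enough to the one GNU grep uses for this input:

```python
import re
from typing import Iterable, Iterator

# Whole-word occurrences of "hello world", as in the grep call.
WORD = re.compile(r"\bhello world\b")


def drop_lines_with_count(lines: Iterable[str], n: int = 21) -> Iterator[str]:
    """Yield every line except those containing exactly n matches."""
    for line in lines:
        # Keep the line unless the match count is exactly n.
        if len(WORD.findall(line)) != n:
            yield line
```

Passing an open input file and writing the result with `writelines` on an open output file would roughly mirror the `< source > target` redirection.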
