3

I'm trying to parse the following sentences with a regex (JavaScript):

  • I wish a TV
  • I want some chocolate
  • I need fire

Currently I'm trying I(\b[a-zA-Z]*\b){0,5}(TV|chocolate|fire), but it doesn't work. I also ran some tests with \w, but no luck.

I want to allow any words (at most 5) between "I" and the last word, which is predefined.

jaumard

4 Answers

4

To account for non-word chars in-between words, you may use

/I(?:\W+\w+){0,5}\W+(?:TV|chocolate|fire)/

The point is that you added word boundaries, but did not account for spaces, punctuation, etc. (all the other non-word chars) between "words".

Pattern details:

  • I - matches the left delimiter
  • (?:\W+\w+){0,5}\W+ - matches 0 to 5 sequences (due to the limiting quantifier {n,m}) of 1+ non-word chars (\W+) and 1+ word chars after them (\w+), and a \W+ at the end matches 1 or more non-word chars that must be present to separate the last matched word chars from the...
  • (?:TV|chocolate|fire) - matches the trailing delimiter
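
The difference between the two patterns can be checked with a short script. This is only a sketch: the variable names and the six-word test sentence are made up for illustration and are not part of the original question.

```javascript
// Attempt from the question: word boundaries only, nothing consumes the spaces.
const original = /I(\b[a-zA-Z]*\b){0,5}(TV|chocolate|fire)/;
// Pattern from this answer: \W+ consumes the separators between words.
const fixed = /I(?:\W+\w+){0,5}\W+(?:TV|chocolate|fire)/;

// The three sentences from the question.
const sentences = ["I wish a TV", "I want some chocolate", "I need fire"];

for (const s of sentences) {
  console.log(s, "| original:", original.test(s), "| fixed:", fixed.test(s));
}

// Hypothetical sentence with six words between "I" and "TV",
// one more than the {0,5} quantifier allows.
const tooLong = "I one two three four five six TV";
console.log("six words in between:", fixed.test(tooLong));
```

Neither pattern is anchored, so both look for a match anywhere in the input; add ^ and $ if the whole string has to be a single sentence.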
Wiktor Stribiżew
0

You need to allow for the whitespace after the "I"; otherwise the pattern won't match the whole sentence.

I(\b[a-zA-Z ]*\b){0,5}(TV|chocolate|fire)

A great site for testing regular expressions is RegExr.
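
One thing worth checking with the pattern above is the 5-word limit. As far as I can tell, because the space sits inside the character class, a single repetition of the group can consume several words at once, so {0,5} no longer counts words. A quick sketch (the variable names and the seven-word sentence are made up for the test):

```javascript
// Pattern from this answer: a space was added inside the character class.
const withSpace = /I(\b[a-zA-Z ]*\b){0,5}(TV|chocolate|fire)/;

// A sentence from the question matches, as intended.
const shortResult = withSpace.test("I wish a TV");
console.log(shortResult);

// Hypothetical sentence with seven words between "I" and "TV".
// [a-zA-Z ]* can swallow all of them in one repetition of the group.
const sevenWords = "I one two three four five six seven TV";
const longResult = withSpace.test(sevenWords);
console.log(longResult);
```

If the limit matters, a pattern that treats the separator and the word as distinct parts of each repetition is the safer choice.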

Stefan Kert
0

If you don't care about the spaces, use:

/I(\s[a-zA-Z]*\s?){0,5}(TV|chocolate|fire)/

Eeko
0

Try

/I\s+(?:\w+\s+){0,5}(TV|chocolate|fire)/

This is based on Stefan Kert's version, but it relies on the whitespace to the right of each extra word instead of on word boundaries.

It also accepts words of any length made of any valid "word" (\w) characters, and any whitespace character as a separator (repeated whitespace is fine).
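
Since the last word is in a capturing group, the same pattern can also be used to extract it. A minimal sketch, where wantedItem is a hypothetical helper name and the last two inputs are made-up test sentences:

```javascript
const re = /I\s+(?:\w+\s+){0,5}(TV|chocolate|fire)/;

// Hypothetical helper: returns the predefined last word,
// or null when the sentence does not match.
function wantedItem(sentence) {
  const m = sentence.match(re);
  return m ? m[1] : null;
}

console.log(wantedItem("I wish a TV"));             // "TV"
console.log(wantedItem("I want   some chocolate")); // "chocolate" (repeated spaces are fine)
console.log(wantedItem("I need fire"));             // "fire"
console.log(wantedItem("I, um, need fire"));        // null (punctuation is not whitespace)
```

The last call shows the trade-off of using \s+ as the separator: punctuation between words prevents a match, which the \W+ based pattern in the first answer tolerates.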

bitifet