I have a regex problem in R. I have three sentences:
s1 <- "today john jack and joe go to the beach"
s2 <- "today joe and john go to the beach"
s3 <- "today jack and joe go to the beach"
For each sentence, I want to know whether john is going to the beach today, regardless of the other two guys. So the expected results for the three sentences are, in order:
TRUE
TRUE
FALSE
I am trying to do this with grepl. The following regex returns TRUE for all three sentences:
print(grepl("today (john|jack|joe|and| )+go to the beach", s1))
print(grepl("today (john|jack|joe|and| )+go to the beach", s2))
print(grepl("today (john|jack|joe|and| )+go to the beach", s3))
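A likely explanation, as a minimal sketch: in this pattern "john" is just one alternative inside the group, so the group can be satisfied by the other words alone. The snippet runs the same regex over all three sentences at once, relying on grepl being vectorised over its input. The names sentences, pattern_any and result_any are made up for illustration.

```r
# The three sentences from above, collected into one character vector.
sentences <- c(
  "today john jack and joe go to the beach",
  "today joe and john go to the beach",
  "today jack and joe go to the beach"
)

# "john" is only one of several alternatives, so the group can match
# using "jack", "joe", "and" and spaces without ever seeing "john".
pattern_any <- "today (john|jack|joe|and| )+go to the beach"

# grepl() is vectorised over its second argument.
result_any <- grepl(pattern_any, sentences)
print(result_any)  # TRUE TRUE TRUE
```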
It works when I sandwich "john", the required word, between two identical quantified groups for the other, optional words. This gives TRUE, TRUE, FALSE as desired:
print(grepl("today (jack|joe|and| )*john(jack|joe|and| )*go to the beach", s1))
print(grepl("today (jack|joe|and| )*john(jack|joe|and| )*go to the beach", s2))
print(grepl("today (jack|joe|and| )*john(jack|joe|and| )*go to the beach", s3))
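As a rough sketch of the same idea, the optional group can at least be written once in the R source and pasted in on both sides of "john". The resulting regex string is identical to the one above, so the group is still repeated inside the pattern itself. The names sentences, others, pattern_john and result_john are hypothetical.

```r
# The three sentences from above, collected into one character vector.
sentences <- c(
  "today john jack and joe go to the beach",
  "today joe and john go to the beach",
  "today jack and joe go to the beach"
)

# Optional words, written once and reused on both sides of "john".
others <- "(jack|joe|and| )*"

# Builds "today (jack|joe|and| )*john(jack|joe|and| )*go to the beach".
pattern_john <- paste0("today ", others, "john", others, "go to the beach")

# "john" is now mandatory, so the third sentence no longer matches.
result_john <- grepl(pattern_john, sentences)
print(result_john)  # TRUE TRUE FALSE
```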
However, repeating the group like this is clearly poor style. Does anyone have a more elegant solution?