I have a vector of strings, say:

vect<-c("oxidor magnesio","oxido magnesio","oxido calcio", "oxidante","oxido calcio magnesio","magnesio oxido")

I'd like to find the elements that contain both words, "oxido" and "magnesio". What I'm doing is

intersect(grep("\\boxido\\b",vect),grep("\\bmagnesio\\b",vect))

But,

  1. Question 1: is there any direct grep command to achieve it?
  2. Question 2: suppose I want to find occurrences of both words, but in a given order (say, for instance, "oxido" followed by "magnesio", so the correct answer would be 2 and 5). What would be the command?

Thanks,

Cyrus
PavoDive

1 Answer

Answer 1: grepl can do this in a single call with two lookaheads (perl = TRUE is required for the lookahead syntax), e.g.:

> grepl("(?=.*\\boxido\\b)(?=.*\\bmagnesio\\b)", vect, perl = TRUE)
[1] FALSE  TRUE FALSE FALSE  TRUE  TRUE
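
As a quick sanity check, the lookahead pattern can be compared against the intersect() approach from the question. This is only a minimal sketch; the variable names idx_lookahead and idx_intersect are introduced here for illustration.

```r
vect <- c("oxidor magnesio", "oxido magnesio", "oxido calcio",
          "oxidante", "oxido calcio magnesio", "magnesio oxido")

# Each lookahead scans ahead without consuming characters,
# so both words must be present, in either order.
both <- grepl("(?=.*\\boxido\\b)(?=.*\\bmagnesio\\b)", vect, perl = TRUE)

# Indices from the single-pattern approach
idx_lookahead <- which(both)

# Indices from the two-grep approach used in the question
idx_intersect <- intersect(grep("\\boxido\\b", vect),
                           grep("\\bmagnesio\\b", vect))

# Both approaches should select the same elements
identical(idx_lookahead, idx_intersect)
```

Using which() on the logical vector, or calling grep() instead of grepl(), gives indices rather than TRUE/FALSE.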

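Answer 2: match the two words in order, with anything in between (drop value = TRUE to get the indices 2 and 5 instead of the strings):

> grep("\\boxido\\b.*\\bmagnesio\\b", vect, value = TRUE)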
[1] "oxido magnesio"        "oxido calcio magnesio"
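
If more than two words are involved, both patterns can be built from a character vector. The sketch below assumes the words contain no regex metacharacters; the helper names any_order_pattern and in_order_pattern are hypothetical, not part of base R.

```r
vect <- c("oxidor magnesio", "oxido magnesio", "oxido calcio",
          "oxidante", "oxido calcio magnesio", "magnesio oxido")

# Hypothetical helper: one lookahead per word, so every word
# must appear somewhere in the string, in any order (needs perl = TRUE).
any_order_pattern <- function(words) {
  paste0("(?=.*\\b", words, "\\b)", collapse = "")
}

# Hypothetical helper: whole words joined by ".*", so they
# must appear in the order given.
in_order_pattern <- function(words) {
  paste0("\\b", words, "\\b", collapse = ".*")
}

words <- c("oxido", "magnesio")

# Indices of elements containing both words, any order
idx_any <- grep(any_order_pattern(words), vect, perl = TRUE)

# Indices of elements containing "oxido" followed by "magnesio"
idx_ordered <- grep(in_order_pattern(words), vect)
```

Here idx_ordered holds the indices asked for in Question 2, while idx_any also includes "magnesio oxido".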
Alexey Ferapontov