2

I am trying to filter a column which contains several keywords (in this example dog and cat) but I am having problems as only the first element is being used.

id <- c(1:7)
type <- c("dog1","dog2" ,"cat1","cat2","zebra1", "parrot5", "elephant15")
filter1 <- c("dog","cat")
df1 <- data.frame(id,type)
dfilter <- df1[grep(filter1,df1$type),]
dfilter

I would be grateful for your help.

Jaap
adam.888

5 Answers

3

grep can use | as an or, so why not paste your filters together with | as a separator:

dfilter <- df1[grep(paste0(filter1, collapse = "|"), df1$type),]
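As a minimal sketch of what that one-liner does, the following rebuilds the question's data, prints the combined pattern, and filters with grepl() instead of grep(). The name `pattern` is introduced here only for illustration, and the keywords are assumed to contain no regex metacharacters (they are not escaped).

```r
# Data from the question (stringsAsFactors = FALSE keeps type as character)
df1 <- data.frame(
  id = 1:7,
  type = c("dog1", "dog2", "cat1", "cat2", "zebra1", "parrot5", "elephant15"),
  stringsAsFactors = FALSE
)
filter1 <- c("dog", "cat")

# Collapse the keywords into a single alternation pattern: "dog|cat"
pattern <- paste(filter1, collapse = "|")
print(pattern)

# grepl() returns one TRUE/FALSE per row, which can be used directly as a row filter
dfilter <- df1[grepl(pattern, df1$type), ]
print(dfilter)
```

A logical vector from grepl() is also easy to combine with other conditions using & or |.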
jeremycg
2

Try this:

dfilter <- df1[unique(unlist(lapply(filter1, function(x) grep(x, df1$type)))), ]

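It's complaining because your filter is a vector of length two, and grep expects its pattern argument to be a single string, so it uses only the first element ("dog") and warns about the rest.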
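Since grep and grepl take one pattern at a time, another way to handle several keywords is to test each one separately and OR the results together. This is only a sketch: the names `hits` and `keep` are made up for illustration, and fixed = TRUE is an assumption that the keywords should be matched literally rather than as regular expressions.

```r
# Data from the question
df1 <- data.frame(
  id = 1:7,
  type = c("dog1", "dog2", "cat1", "cat2", "zebra1", "parrot5", "elephant15"),
  stringsAsFactors = FALSE
)
filter1 <- c("dog", "cat")

# One logical vector per keyword; fixed = TRUE treats each keyword as a literal string
hits <- lapply(filter1, function(p) grepl(p, df1$type, fixed = TRUE))

# A row is kept if at least one keyword matched it (element-wise OR across the list)
keep <- Reduce(`|`, hits)

dfilter <- df1[keep, ]
print(dfilter)
```

Because the result is a logical vector, rows keep their original order and a row that matches more than one keyword appears only once.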

Edit:

From this answer:

dfilter <- df1[df1$type %in% grep(paste(filter1, collapse="|"), df1$type, value=TRUE), ]
TayTay
2

As @Tgsmith61591 mentioned, the pattern argument of the grep function requires a single string. Because you are passing in a vector, R warns you that it will only process the first element.

Another solution, which returns the unique matching values rather than the matching rows of df1, would be something like this:

dfilter <- unique(grep(paste(filter1, collapse = "|"), df1$type, value=TRUE))

See this post: "grep using a character vector with multiple patterns".
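To make the difference between the return types concrete, here is a small sketch comparing grep with and without value = TRUE on the question's type column. The names `idx` and `vals` are hypothetical and used only for this illustration.

```r
type <- c("dog1", "dog2", "cat1", "cat2", "zebra1", "parrot5", "elephant15")
filter1 <- c("dog", "cat")
pattern <- paste(filter1, collapse = "|")

# Default: integer positions of the matching elements (usable as row indices)
idx <- grep(pattern, type)

# value = TRUE: the matching strings themselves
vals <- grep(pattern, type, value = TRUE)

print(idx)
print(vals)
```

The positions in `idx` can be used to subset a data frame by row, whereas `vals` would have to be matched back against the column (for example with %in%) to do the same.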

1

df1[gsub('\\d', '', df1$type) %in% filter1, ]
  id type
1  1 dog1
2  2 dog2
3  3 cat1
4  4 cat2
Shenglin Chen
1

Here is a dplyr method:

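library(stringi)
library(dplyr)

# Keywords to keep (the same values as filter1 in the question)
animals <- c("dog", "cat")

data <- tibble(
  id = 1:7,
  type = c("dog1", "dog2", "cat1", "cat2", "zebra1", "parrot5", "elephant15")
)

# Build one regex ("dog|cat") and keep the rows whose type matches it
data %>%
  filter(stri_detect_regex(type, paste(animals, collapse = "|")))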
bramtayl