I have a data frame like this.

df
    Tour    Order   Machine    Company
[1]    A        D         D          B
[2]    B        B         A          G
[3]    A        E         B          A
[4]    C        B         C          B
[5]    A        G         G          C

I want to get the rows where at least one of the three columns Tour, Order, and Machine contains D, E, or G.

The result should be:

    Tour    Order   Machine    Company
[1]    A        D         D          B
[3]    A        E         B          A
[5]    A        G         G          C

My attempt:

df %>%
    filter(any(c(Tour, Order, Machine) %in% c('D', 'E', 'G')))

But it doesn't filter correctly (all the rows are returned). Could anybody please help me?

zx8754
Makoto Miyazaki

5 Answers

Another tidyverse approach using filter_at

df %>% filter_at(vars(-Company), any_vars(. %in% c("D", "E", "G")))
#  Tour Order Machine Company
#1    A     D       D       B
#2    A     E       B       A
#3    A     G       G       C

Update for dplyr >= 1.0

filter_at and any_vars have been superseded by if_any, which allows for the more succinct:

df %>% filter(if_any(-Company, `%in%`, c("D", "E", "G")))
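
If you would rather not pass the extra argument through `...` (which, as far as I know, newer dplyr releases discourage for across-style helpers), the same condition can be written as a lambda. This is only a sketch: it rebuilds the question's data, assumes a dplyr version that ships if_any (1.0.4 or later, I believe), and `targets` is just a name I made up for the lookup vector.

```r
library(dplyr)

# Example data rebuilt from the question
df <- data.frame(
  Tour    = c("A", "B", "A", "C", "A"),
  Order   = c("D", "B", "E", "B", "G"),
  Machine = c("D", "A", "B", "C", "G"),
  Company = c("B", "G", "A", "B", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper name for the values to look for
targets <- c("D", "E", "G")

# Keep a row when any column except Company holds one of the targets
res <- df %>% filter(if_any(-Company, ~ .x %in% targets))
res
```

The lambda keeps the condition readable when the test gets more involved than a single `%in%`.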
Maurits Evers

Another option:

df[rowSums(sapply(df[-4], '%in%', c('D', 'E', 'G'))) > 0,]

The result:

  Tour Order Machine Company
1    A     D       D       B
3    A     E       B       A
5    A     G       G       C

With dplyr you should add rowwise(), so that any() is evaluated once per row instead of once over the whole columns:

df %>%
  rowwise() %>% 
  filter(any(c(Tour, Order, Machine) %in% c('D', 'E', 'G')))
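
To see why the attempt without rowwise() returns every row, it may help to evaluate the condition by hand in base R. This is a rough sketch on the question's data; `cond` and `per_row` are names I picked for illustration.

```r
# Example data rebuilt from the question
df <- data.frame(
  Tour    = c("A", "B", "A", "C", "A"),
  Order   = c("D", "B", "E", "B", "G"),
  Machine = c("D", "A", "B", "C", "G"),
  Company = c("B", "G", "A", "B", "C"),
  stringsAsFactors = FALSE
)

# Inside an ungrouped filter(), Tour, Order and Machine are whole columns,
# so c() glues them into one vector and any() collapses it to one value
cond <- any(c(df$Tour, df$Order, df$Machine) %in% c("D", "E", "G"))
cond  # a single TRUE, which is recycled for every row

# What was intended: one any() per row
per_row <- apply(df[c("Tour", "Order", "Machine")], 1,
                 function(r) any(r %in% c("D", "E", "G")))
df[per_row, ]
```

rowwise() makes dplyr evaluate the expression one row at a time, which is what the second part does by hand.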
h3rm4n
ind <- apply(sapply(df1[c("Tour","Order","Machine")],`%in%`,c('D', 'E', 'G')),1,any)
df1[ind,]
#   Tour Order Machine Company
# 1    A     D       D       B
# 3    A     E       B       A
# 5    A     G       G       C
  • sapply returns a logical matrix with one column per tested column, holding the match result for each cell.
  • apply(..., 1, any) checks, row by row, whether any of them is TRUE, which means you want to keep the row.
  • Finally, we subset the input with that logical index.
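
The three steps above can be run one at a time to inspect the intermediate objects. A small sketch on the same data; `m` is just a name I chose for the intermediate matrix.

```r
# Example data, as in the data section of this answer
df1 <- data.frame(
  Tour    = c("A", "B", "A", "C", "A"),
  Order   = c("D", "B", "E", "B", "G"),
  Machine = c("D", "A", "B", "C", "G"),
  Company = c("B", "G", "A", "B", "C"),
  stringsAsFactors = FALSE
)

# Step 1: one logical column per tested column (a 5 x 3 matrix)
m <- sapply(df1[c("Tour", "Order", "Machine")], `%in%`, c("D", "E", "G"))
dim(m)

# Step 2: collapse each row with any(); TRUE for rows 1, 3 and 5
ind <- apply(m, 1, any)

# Step 3: subset the input with the logical index
df1[ind, ]
```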

A dplyr version:

df1 %>%
  filter_at(c("Tour", "Order", "Machine"), any_vars(. %in% c('D', 'E', 'G')))
#   Tour Order Machine Company
# 1    A     D       D       B
# 2    A     E       B       A
# 3    A     G       G       C

data

df1 <- read.table(header=TRUE,stringsAsFactors=FALSE,text="
 Tour    Order   Machine    Company
    A        D         D          B
    B        B         A          G
    A        E         B          A
    C        B         C          B
    A        G         G          C")
moodymudskipper

You can lapply over the columns to check for matches, then Reduce using | (or) to select if there are any matches.

df[Reduce('|', lapply(df[-4], '%in%', c('D', 'E', 'G'))),]
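
Spelled out step by step, the one-liner above looks roughly like this. A sketch on the question's data; `hits` and `keep` are names I introduced for the intermediate results.

```r
# Example data rebuilt from the question
df <- data.frame(
  Tour    = c("A", "B", "A", "C", "A"),
  Order   = c("D", "B", "E", "B", "G"),
  Machine = c("D", "A", "B", "C", "G"),
  Company = c("B", "G", "A", "B", "C"),
  stringsAsFactors = FALSE
)

# One logical vector per column (Company, the 4th column, is dropped)
hits <- lapply(df[-4], `%in%`, c("D", "E", "G"))

# Combine the vectors element-wise with OR: Tour | Order | Machine
keep <- Reduce(`|`, hits)

df[keep, ]
```

Since `|` is vectorised, no intermediate matrix is built here; the list is folded column by column.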
IceCreamToucan

Using base R:

df1[grepl("[DEG]", do.call(paste, df1[-4])), ]  # "D|E|G" also works as the pattern

  Tour Order Machine Company
1    A     D       D       B
3    A     E       B       A
5    A     G       G       C
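
One caveat with the pasted-string approach: the pattern matches anywhere in the string, so it relies on every value being a single letter, as in the question. If the values can be longer, a word-boundary pattern is one possible workaround. A sketch on made-up data (`df2` is hypothetical, not from the question):

```r
# Hypothetical data with a multi-character value ("DA") that should NOT match
df2 <- data.frame(
  Tour    = c("A", "DA"),
  Order   = c("D", "B"),
  Machine = c("B", "C"),
  Company = c("B", "G"),
  stringsAsFactors = FALSE
)

# Paste the tested columns of each row into one space-separated string
rows <- do.call(paste, df2[-4])

# The character class also matches the "D" inside "DA"
loose <- grepl("[DEG]", rows)

# Word boundaries restrict the match to whole values
strict <- grepl("\\b(D|E|G)\\b", rows, perl = TRUE)

df2[strict, ]
```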
Onyambu