0

I have a df like this:

VisitID | Item |
1       | A    |
1       | B    |
1       | C    |
1       | D    |
2       | A    |
2       | D    |
2       | B    |
3       | B    |
3       | C    |
4       | D    |
4       | C    |

In R, how do I filter for VisitIDs as long as they contain Item A & B? Expected Outcome:

VisitID | Item |
1       | A    |
1       | B    |
1       | C    |
1       | D    |
2       | A    |
2       | D    |
2       | B    |

I tried df %>% group_by(VisitID) %>% filter(any(Item == 'A' & Item == 'B')) but it doesn't work..

df <- read_delim("ID | Item 
1  | A    
1  | B    
2  | A    
3  | B    
1  | C    
4  | C    
5  | B    
3  | A    
4  | A    
5  | D", delim = "|", trim_ws = TRUE)
MrFlick
  • 195,160
  • 17
  • 277
  • 295
spidermarn
  • 959
  • 1
  • 10
  • 18
  • 1
    Try `filter(any(Item == 'A') & any(Item == 'B'))` It's not possible for an Item to be both A and B at the same time – MrFlick Jun 02 '20 at 05:57

2 Answers2

3

Since you want both "A" and "B" you can use all

library(dplyr)
df %>% group_by(VisitID) %>% filter(all(c("A", "B") %in% Item))

#  VisitID Item 
#    <int> <chr>
#1       1 A    
#2       1 B    
#3       1 C    
#4       1 D    
#5       2 A    
#6       2 D    
#7       2 B    

OR if you want to use any use them separately.

df %>% group_by(VisitID) %>% filter(any(Item == 'A') && any(Item == 'B'))
MichaelChirico
  • 33,841
  • 14
  • 113
  • 198
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
0

An otion with data.table

library(data.table)
setDT(df)[, .SD[all(c("A", "B") %in% Item)], VisitID]
akrun
  • 874,273
  • 37
  • 540
  • 662