
Say I have this data.frame of bad subjects:

df_bad = data.frame(id=c(1,2,3), condition=c('fun', 'boring', 'boring'))

Subject 1 is bad in the "fun" condition and subjects 2 and 3 are bad in the "boring" condition. Now I have my data:

df = data.frame(id=c(1,1,2,2,3,3), condition=rep(c('fun', 'boring'), times=3), score=rnorm(6))

How do I remove the rows of df that match a pair of id AND condition in df_bad using tidyr? I.e., how do I end up with this data.frame:

df = data.frame(id=c(1,2,3), condition=c('boring', 'fun', 'fun'), score=df$score[c(2,3,5)])

Ideally, the solution should also work for triplets of values in df_bad.

Jonas Lindeløv

1 Answer


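We can use anti_join from dplyr (rather than tidyr). It returns the rows of df that have no match in df_bad on the join columns; when by is omitted, it joins on all columns the two data frames share, here id and condition.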

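library(dplyr)
# Keep only the rows of df whose (id, condition) pair does not appear in df_bad
anti_join(df, df_bad, by = c("id", "condition"))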
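
As a rough check, here is a minimal sketch that runs the join on the data from the question and then on triplets. The seed, the session column, and the names df3, df3_bad, result and kept are assumptions made up for illustration; they are not part of the original question.

```r
library(dplyr)

set.seed(1)  # assumed seed, only so the scores are reproducible

df_bad <- data.frame(id = c(1, 2, 3), condition = c("fun", "boring", "boring"))
df <- data.frame(id = c(1, 1, 2, 2, 3, 3),
                 condition = rep(c("fun", "boring"), times = 3),
                 score = rnorm(6))

# Pairs: drop every row of df whose (id, condition) appears in df_bad
result <- anti_join(df, df_bad, by = c("id", "condition"))

# Triplets: hypothetical data with a third key column, session
df3 <- data.frame(id = c(1, 1, 2, 2),
                  condition = c("fun", "boring", "fun", "boring"),
                  session = c("a", "a", "b", "b"),
                  score = c(0.1, 0.2, 0.3, 0.4))
df3_bad <- data.frame(id = c(1, 2),
                      condition = c("fun", "boring"),
                      session = c("a", "a"))

# A row is dropped only if all three key columns match a row of df3_bad
kept <- anti_join(df3, df3_bad, by = c("id", "condition", "session"))
```

In the triplet case, the row for id 2 in the "boring" condition is kept, because its session is "b" while df3_bad lists session "a"; a match on only two of the three columns is not enough to remove a row.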
akrun