I'm trying to conditionally filter a data frame to extract the rows of interest. What I'm trying to do differs from generic conditional filtering in that each rule applies to a pair of columns at once, and the set of rules can vary.
My reprex below simulates a data.frame which involves 4 samples: Control, Drug_1, Drug_2, and Drug_3, and pairwise comparisons among them (the difference is shown as the p_value). I'd like to use this piece of code in a function to potentially compare more than 4 groups. I tried combining the filtering criteria with OR operators, but I ended up with rather ugly code.
My end goal is obtaining a filtered_df that shows all the rows in which the variables group1 and group2 match one of the pairs in my comparisons list. Any help is appreciated!
Best, Atakan
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
# Make a mock data frame
gene <- "ABCD1"
group1 <- c("Control", "Control", "Control", "Drug_1", "Drug_1", "Drug_2")
group2 <- c("Drug_1", "Drug_2", "Drug_3", "Drug_2", "Drug_3", "Drug_3")
p_value <- c(0.4, 0.001, 0.003, 0.01, 0.3, 0.9)
df <- data.frame(gene, group1, group2, p_value)
df
#>    gene  group1 group2 p_value
#> 1 ABCD1 Control Drug_1   0.400
#> 2 ABCD1 Control Drug_2   0.001
#> 3 ABCD1 Control Drug_3   0.003
#> 4 ABCD1  Drug_1 Drug_2   0.010
#> 5 ABCD1  Drug_1 Drug_3   0.300
#> 6 ABCD1 Drug_2 Drug_3   0.900
# I'd like to keep the rows where group1 and group2 match one of the following pairs
comparisons <- list(c("Control", "Drug_1"), c("Control", "Drug_2"), c("Drug_2", "Drug_3"))
# I can filter by using one pair as follows:
filtered_df <- df %>%
  filter(group1 == comparisons[[1]][1] & group2 == comparisons[[1]][2])
filtered_df
#>    gene  group1 group2 p_value
#> 1 ABCD1 Control Drug_1     0.4
Created on 2018-06-29 by the reprex package (v0.2.0).
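For reference, one possible way to match all pairs at once, rather than chaining conditions with OR, is to turn the comparisons list into a small lookup table and use a filtering join. This is only a sketch: pairs_df is a hypothetical helper name that is not part of the reprex, and it assumes each pair is written in the same order as the group1 and group2 columns (a reversed pair such as c("Drug_1", "Control") would not match).

```r
library(dplyr)

# Mock data from the reprex (stringsAsFactors = FALSE keeps the columns as
# character on older R versions, so they join cleanly with the lookup table)
df <- data.frame(
  gene = "ABCD1",
  group1 = c("Control", "Control", "Control", "Drug_1", "Drug_1", "Drug_2"),
  group2 = c("Drug_1", "Drug_2", "Drug_3", "Drug_2", "Drug_3", "Drug_3"),
  p_value = c(0.4, 0.001, 0.003, 0.01, 0.3, 0.9),
  stringsAsFactors = FALSE
)
comparisons <- list(c("Control", "Drug_1"), c("Control", "Drug_2"), c("Drug_2", "Drug_3"))

# Stack the pairs into a two-column lookup table (pairs_df is a hypothetical name)
pairs_df <- as.data.frame(do.call(rbind, comparisons), stringsAsFactors = FALSE)
names(pairs_df) <- c("group1", "group2")

# semi_join() keeps the rows of df that have a matching (group1, group2) pair
# in pairs_df, without adding any columns from pairs_df
filtered_df <- semi_join(df, pairs_df, by = c("group1", "group2"))
filtered_df
```

Because the pairs live in a table rather than in hand-written conditions, the same code should work unchanged when the comparisons list grows beyond four groups.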