I'd like to know whether I can use an apply or purrr function to loop over several columns and filter each one on the same value. For example, I want to keep only the rows where every column from Q2_1 to Q2_10 equals 5. Currently, I'm doing this in a highly repetitive way:

data %>% 
  filter(Q2_1 == 5) %>% 
  filter(Q2_2 == 5) %>% 
  filter(Q2_3 == 5) %>% 
  filter(Q2_4 == 5) %>% 
  filter(Q2_5 == 5) %>% 
  filter(Q2_6 == 5) %>% 
  filter(Q2_7 == 5) %>% 
  filter(Q2_8 == 5) %>% 
  filter(Q2_9 == 5) %>% 
  filter(Q2_10 == 5)

I'm not sure how to get started with lapply or with the tidyverse approach. Should I wrap the filter in a function, or call filter inside the loop?

writer_typer

2 Answers

Here is a starting point:

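library(dplyr)

# Names of the columns to filter on
cols <- c("Q2_1","Q2_2","Q2_3","Q2_4","Q2_5","Q2_6","Q2_7","Q2_8","Q2_9","Q2_10")

# Apply one filter per column; each pass keeps rows where that column equals 5
for (col in cols) {
  data <- data %>%
    filter(!!sym(col) == 5)
}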

Note that cols is a character vector, so we use !!sym(.) to convert each string into a symbol that filter then evaluates as a column of data. Both sym and the !! operator come from the rlang package; dplyr re-exports sym, so attaching dplyr is enough.
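To make the effect of !!sym(col) concrete, here is a minimal sketch on a made-up data frame. The name toy and its values are hypothetical, since the asker's data is not shown; the point is only that the string-based filter keeps the same rows as writing the column name directly.

```r
library(dplyr)

# Hypothetical toy data; not the asker's actual data
toy <- data.frame(Q2_1 = c(5, 5, 3), Q2_2 = c(5, 4, 5))

col <- "Q2_1"

# sym() turns the string into a symbol and !! injects it into the call,
# so this behaves like writing filter(Q2_1 == 5) by hand
by_string <- toy %>% filter(!!sym(col) == 5)
by_name   <- toy %>% filter(Q2_1 == 5)

# Both approaches keep the same rows
all.equal(by_string, by_name)
```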

There are other ways to do this. See the tidyverse guidance on programming with dplyr for other options and explanations.
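One such option is the .data pronoun, which looks a column up by its string name without building a symbol. The sketch below is only an illustration on a hypothetical three-column data frame (toy and result are made-up names), not necessarily the approach the guidance would recommend for every case.

```r
library(dplyr)

# Hypothetical toy data with three of the Q2_ columns
toy <- data.frame(
  Q2_1 = c(5, 5, 3),
  Q2_2 = c(5, 4, 5),
  Q2_3 = c(5, 5, 5)
)
cols <- c("Q2_1", "Q2_2", "Q2_3")

result <- toy
for (col in cols) {
  # .data[[col]] refers to the column whose name is stored in col
  result <- result %>% filter(.data[[col]] == 5)
}

# Only rows where every listed column equals 5 remain
result
```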

Simon.S.A.
  • I like this! Is there a way to loop over each column instead of the `cols = c("Q2_1", "Q2_2" . . . `? – writer_typer Jul 09 '21 at 03:30
  • You could use `cols = colnames(df)` to loop over all columns. Otherwise you could look into pattern matching with `grepl`. – Simon.S.A. Jul 09 '21 at 05:59
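As a rough sketch of the grepl suggestion, the column names can be picked by pattern instead of typed out. The data frame toy, its id and Q3_1 columns, and the regular expression are assumptions made for illustration; adjust the pattern to the real column names. Note that looping over every column from colnames would require all of them, including any id column, to equal 5.

```r
library(dplyr)

# Hypothetical toy data with some columns that should not be filtered
toy <- data.frame(
  id   = 1:3,
  Q2_1 = c(5, 5, 3),
  Q2_2 = c(5, 4, 5),
  Q3_1 = c(1, 2, 3)
)

# Keep only the names that start with "Q2_" followed by digits
cols <- colnames(toy)[grepl("^Q2_[0-9]+$", colnames(toy))]

result <- toy
for (col in cols) {
  result <- result %>% filter(!!sym(col) == 5)
}

result
```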

You can use if_all() (available in dplyr 1.0.4 and later):

library(dplyr)

data %>% filter(if_all(Q2_1:Q2_10, ~. == 5))
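For a quick check of how this behaves, here is a small sketch on a hypothetical data frame (toy is a made-up name with only three Q2_ columns). It compares if_all with the chained filters from the question and contrasts it with if_any, which keeps a row when at least one column matches.

```r
library(dplyr)

# Hypothetical toy data; not the asker's actual data
toy <- data.frame(
  Q2_1 = c(5, 5, 3),
  Q2_2 = c(5, 4, 5),
  Q2_3 = c(5, 5, 5)
)

# Rows where every column in the range equals 5
all_five <- toy %>% filter(if_all(Q2_1:Q2_3, ~ .x == 5))

# The chained version from the question, for comparison
chained <- toy %>%
  filter(Q2_1 == 5) %>%
  filter(Q2_2 == 5) %>%
  filter(Q2_3 == 5)

# Rows where at least one column in the range equals 5
any_five <- toy %>% filter(if_any(Q2_1:Q2_3, ~ .x == 5))

# Selecting by prefix instead of by range
by_prefix <- toy %>% filter(if_all(starts_with("Q2_"), ~ .x == 5))
```

If the columns are not adjacent in the data, a selection helper such as starts_with may be safer than the Q2_1:Q2_10 range, which depends on column order.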
Ronak Shah