I use dplyr's filter() function all the time for tidying my data. Today it has stopped working when using the | operator. I am certain I have been able to use the | to filter any observation that meets any of the criteria separated by the | but it isn't working all of a sudden. Any help/guidance is greatly appreciated, as always. Reprex is below.

library(tidyverse)
#> Warning: package 'tibble' was built under R version 3.6.2
#> Warning: package 'tidyr' was built under R version 3.6.2
#> Warning: package 'purrr' was built under R version 3.6.2
id <- c(1:20)
YEAR <- c(2009,2009,2009,2009,2010,2010,2010,2010,2011,2011,2011,2011,2012,2012,2012,2012,2013,2013,2013,2013)
df1 <- data.frame(id,YEAR)
df1
#>    id YEAR
#> 1   1 2009
#> 2   2 2009
#> 3   3 2009
#> 4   4 2009
#> 5   5 2010
#> 6   6 2010
#> 7   7 2010
#> 8   8 2010
#> 9   9 2011
#> 10 10 2011
#> 11 11 2011
#> 12 12 2011
#> 13 13 2012
#> 14 14 2012
#> 15 15 2012
#> 16 16 2012
#> 17 17 2013
#> 18 18 2013
#> 19 19 2013
#> 20 20 2013
df1 <- df1 %>% dplyr::filter(YEAR == 2009|2010)
df1
#>    id YEAR
#> 1   1 2009
#> 2   2 2009
#> 3   3 2009
#> 4   4 2009
#> 5   5 2010
#> 6   6 2010
#> 7   7 2010
#> 8   8 2010
#> 9   9 2011
#> 10 10 2011
#> 11 11 2011
#> 12 12 2011
#> 13 13 2012
#> 14 14 2012
#> 15 15 2012
#> 16 16 2012
#> 17 17 2013
#> 18 18 2013
#> 19 19 2013
#> 20 20 2013

Expected results would be:

df1 <- df1 %>% dplyr::filter(YEAR == 2009|2010)
df1
#>    id YEAR
#> 1   1 2009
#> 2   2 2009
#> 3   3 2009
#> 4   4 2009
#> 5   5 2010
#> 6   6 2010
#> 7   7 2010
#> 8   8 2010

The following works filtering on a single condition:

df1 <- df1 %>% dplyr::filter(YEAR == 2009)
df1
#>   id YEAR
#> 1  1 2009
#> 2  2 2009
#> 3  3 2009
#> 4  4 2009
Abe

2 Answers

We can use %in% instead of == for more than one element

library(dplyr)
df1 %>% 
    dplyr::filter(YEAR %in% c(2009, 2010))

With |, we need to repeat the full comparison on each side

df1 %>%
    dplyr::filter(YEAR == 2009|YEAR == 2010)

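In YEAR == 2009|2010, the == is evaluated first, so the condition is read as (YEAR == 2009) | 2010. Any non-zero number is coerced to TRUE when used with |, so the result is TRUE for every row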

2019|2020
#[1] TRUE

0|0
#[1] FALSE
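
To see why the original filter kept every row, it may help to look at how R parses the condition. The following is a minimal sketch in base R, assuming the same YEAR values as in the question (built here with rep() for brevity):

```r
# Same 20 YEAR values as in the question (assumed; built with rep() for brevity)
YEAR <- rep(2009:2013, each = 4)

# == has higher precedence than |, so these two are the same expression
parsed_as  <- (YEAR == 2009) | 2010
as_written <- YEAR == 2009|2010
identical(parsed_as, as_written)
#> [1] TRUE

# 2010 is non-zero, so it is coerced to TRUE and recycled across every row
all(as_written)
#> [1] TRUE

# Repeating the comparison gives the intended mask: only 2009 and 2010 rows
intended <- YEAR == 2009 | YEAR == 2010
sum(intended)
#> [1] 8
```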
akrun
  • I figured it must have been something simple I'd overlooked. Apologies for my ignorance. – Abe Jun 08 '20 at 21:30

I think your approach would also work if you repeat the column name on both sides of the |:

df1 <- df1 %>% dplyr::filter(YEAR == 2009|YEAR == 2010)

I think of the two sides of | as two separate conditions, each of which has to work as a filter on its own. In your YEAR == 2009|2010, the second part is just 2010 by itself, which is not a comparison. R coerces that non-zero number to TRUE, so every row passes the filter.
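
One way to check that the repeated comparison selects the same rows as the %in% form is sketched below. It assumes the df1 from the question and uses base R subsetting rather than filter() so that it runs without any packages; the logical vectors are the same conditions one would pass to filter().

```r
# Rebuild the question's data frame (assumed; YEAR built with rep() for brevity)
id <- 1:20
YEAR <- rep(2009:2013, each = 4)
df1 <- data.frame(id, YEAR)

# Each side of | is a complete comparison
keep_or <- df1$YEAR == 2009 | df1$YEAR == 2010
# Equivalent membership test
keep_in <- df1$YEAR %in% c(2009, 2010)

identical(keep_or, keep_in)
#> [1] TRUE

# Only the 2009 and 2010 rows remain
df1[keep_or, ]$id
#> [1] 1 2 3 4 5 6 7 8
```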

TTS
  • Perfect, thank you. I marked akrun's response as the accepted answer because it provides two possible solutions. Thank you! – Abe Jun 08 '20 at 21:31