It's right there in the documentation for ?dplyr::filter (although it seems like this was only added to the documentation 9 months ago):
Use filter() find rows/cases where conditions are true. Unlike base subsetting, rows where the condition evaluates to NA are dropped.
This is consistent with the way base::subset() works, but not with the way subsetting with [ plus logical indexing works.
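To make the difference concrete, here is a small base-R sketch. The data frame `mydf` and its columns `x` and `y` are made up for illustration and are not taken from the question.

```r
# Hypothetical example data, invented for this illustration.
mydf <- data.frame(x = 1:4, y = c("a", "b", NA, "c"))

# Comparing NA with anything yields NA, not TRUE or FALSE.
cond <- mydf$y != "a"

# subset() treats an NA condition as FALSE, so the NA row is dropped.
kept_subset <- subset(mydf, y != "a")

# `[` with a logical index returns an all-NA row for each NA in the index.
kept_bracket <- mydf[cond, ]
```

Here `kept_subset` has two rows, while `kept_bracket` has three, one of which is filled entirely with NA.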
As @akrun says in the comments, you can use filter(mydf, y != 'a' | is.na(y)) to preserve NA values. It would be nice to be able to use identical() or isTRUE(), but these aren't vectorized. You could write a convenience wrapper:
# TRUE where x equals value or x is NA, so rows with NA are kept
eq <- function(x, value) { x == value | is.na(x) }
filter(mydf, eq(y, "a"))
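Note that eq() tests for equality, whereas the filter() call suggested in the comments tests for inequality. A mirror-image helper for that case might look like the sketch below. The name `neq` and the example data are hypothetical, chosen only for illustration.

```r
library(dplyr)

# Hypothetical example data, invented for this illustration.
mydf <- data.frame(x = 1:4, y = c("a", "b", NA, "c"))

# Hypothetical helper: TRUE where x differs from value or x is NA.
# NA != value is NA, and NA | TRUE is TRUE, so NA rows pass the filter.
neq <- function(x, value) x != value | is.na(x)

res <- filter(mydf, neq(y, "a"))
```

With this data, `res` keeps the rows where y is "b", NA, or "c". Another option that avoids a helper is `filter(mydf, !(y %in% "a"))`, since `%in%` returns FALSE rather than NA for missing values.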