
What is the best way to filter rows from a data frame when the values to be deleted are stored in a vector? In my case I have a column with dates and want to remove several dates.

I know how to delete rows corresponding to one day, using !=, e.g.:

m[m$date != "01/31/11", ]

To remove several dates, specified in a vector, I tried:

m[m$date != c("01/31/11", "01/30/11"), ]

However, this generates a warning message:

Warning message:
In `!=.default`(m$date, c("01/31/11", "01/30/11")) :
longer object length is not a multiple of shorter object length
Calls: [ ... [.data.frame -> Ops.dates -> NextMethod -> Ops.times -> NextMethod

What is the correct way to apply a filter based on multiple values?

Henrik
matt_k

4 Answers


nzcoops is spot on with his suggestion. I posed this question in the R Chat a while back and Paul Teetor suggested defining a new function:

`%notin%` <- function(x,y) !(x %in% y) 

Which can then be used as follows:

foo <- letters[1:6]
foo[foo %notin% c("a", "c", "e")]
# [1] "b" "d" "f"

Needless to say, this little gem is now in my R profile and gets used quite often.
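
Applied to the data frame case from the question, the same operator might be used roughly as follows. This is only a sketch: the data frame `m` below is a made-up stand-in, its `value` column and the name `dates_to_drop` are hypothetical, and the dates are stored as plain character strings rather than the date class used in the question.

```r
# Operator from the answer above
`%notin%` <- function(x, y) !(x %in% y)

# Hypothetical stand-in for `m`; dates are plain strings (an assumption)
m <- data.frame(
  date  = c("01/29/11", "01/30/11", "01/31/11", "02/01/11"),
  value = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Vector of dates whose rows should be removed
dates_to_drop <- c("01/31/11", "01/30/11")

# Keep only the rows whose date is not in the vector
m_kept <- m[m$date %notin% dates_to_drop, ]
m_kept$date
```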

Chase

I think for that you want:

m[!m$date %in% c("01/31/11","01/30/11"),]
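
To see why `!=` misbehaves with a vector on the right-hand side, a small comparison may help. The values below are illustrative only, the names `dates`, `to_drop`, `recycled` and `keep` are made up for the example, and plain character dates are assumed instead of the date class from the question. The warning is suppressed here only to keep the output clean.

```r
# Illustrative values only; plain character dates are an assumption
dates   <- c("01/30/11", "01/31/11", "02/01/11")
to_drop <- c("01/31/11", "01/30/11")

# `!=` compares element by element and recycles the shorter vector,
# so each date is checked against only one of the two values
recycled <- suppressWarnings(dates != to_drop)

# `%in%` tests each date for membership in the whole set
keep <- !(dates %in% to_drop)

dates[recycled]  # with these values, no date is removed
dates[keep]      # only the date that is not in `to_drop` is kept
```

With `!=`, whether a row is dropped depends on where it happens to line up with the recycled vector, which is why the membership test is the safer choice.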
nzcoops

A neat way is to use the Negate function to create a new operator:

`%ni%` <- Negate(`%in%`) 

Then you can use it to find the elements that are not in a given set.
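
A minimal usage sketch, with made-up vectors (`all_ids` and `excluded_ids` are hypothetical names, not from the question):

```r
# Negated version of %in%, as defined above
`%ni%` <- Negate(`%in%`)

# Hypothetical example data
all_ids      <- c(1, 2, 3, 4, 5)
excluded_ids <- c(2, 4)

# Keep the elements of all_ids that do not appear in excluded_ids
remaining <- all_ids[all_ids %ni% excluded_ids]
remaining
```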


In response to some of the questions above, here is a tidyverse-compatible solution. It uses anti_join from dplyr to achieve the same effect:

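library(tidyverse)

# Both tibbles use the same column name so the join has a common key
numbers <- tibble(number = 1:10)
numbers_to_remove <- tibble(number = c(3, 4, 5))

# Keep only the rows of `numbers` with no match in `numbers_to_remove`
numbers %>%
  anti_join(numbers_to_remove, by = "number")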
Ben G