5

If you have two vectors of strings and want to know which elements of one appear in the other, it's a simple matter of using `%in%`.

x <- c("red","blue","green")
y <- c("yellow","blue","orange")

which(x %in% y) # Literally, which X are in Y.

But what about the opposite, where you would like to find which X are not in Y?

Brandon Bertelsen

2 Answers

8

A neat way that I like (that I learnt from @joran, iirc) is:

`%nin%` <- Negate(`%in%`)
which(x %nin% y)
[1] 1 3    
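
For reference, a minimal sketch of what `Negate` is doing here, reusing `x` and `y` from the question: it returns a new function that calls `%in%` and applies `!` to the result, so `%nin%` produces the complementary logical vector.

```r
x <- c("red", "blue", "green")
y <- c("yellow", "blue", "orange")

# Negate() wraps `%in%` in a function that flips its logical result
`%nin%` <- Negate(`%in%`)

x %nin% y          # TRUE FALSE TRUE: TRUE where an element of x is absent from y
which(x %nin% y)   # 1 3: positions of those elements
x[x %nin% y]       # "red" "green": the elements themselves
```

Because the operator is defined once and reused, there is no need to think about where the `!` goes in each call.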
Arun
5

`%in%` returns a logical vector of TRUEs and FALSEs. Putting an exclamation mark in front flips those values, and wrapping the whole expression in `which()` gives you the indices.

> which(!x %in% y)
[1] 1 3
> which(x %in% y)
[1] 2
Roman Luštrik
  • 1
    Also, if you are going to do `x[which(!x %in% y)]`, then you might prefer `setdiff(x, y)`. Note that the latter also applies `unique`. – flodel Apr 05 '13 at 23:16
  • 2
    I may be paranoid, but I'd humbly suggest `which(!(x %in% y))`, to improve readability. – Ferdinand.kraft Apr 06 '13 at 01:09
  • @Ferdinand.kraft I agree with you and I like the parentheses too. However, that's one more reason why the `%ni%` suggestion below is so great! – Ricardo Saporta Apr 06 '13 at 04:06
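
To illustrate the two points raised in the comments, here is a small sketch. The extra `"red"` in `x` is hypothetical data, added only so the effect of `unique` is visible.

```r
x <- c("red", "blue", "green", "red")   # duplicated "red" added for illustration
y <- c("yellow", "blue", "orange")

x[which(!x %in% y)]   # "red" "green" "red": duplicates are kept
setdiff(x, y)         # "red" "green": duplicates are dropped

# %in% binds more tightly than unary !, so both spellings parse the same way;
# the extra parentheses are purely for the reader
identical(!x %in% y, !(x %in% y))   # TRUE
```

So the choice between the indexing form and `setdiff` mostly comes down to whether repeated values in `x` should survive.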