
I would like an explanation of the dplyr filter behavior below:

library(dplyr)

df <- data.frame(x = rep('test', 3), y = c('service', 'audio', 'video'))

filter(df, y == 'service')
# result 1
#    x       y
# test service

filter(df, 'service' %in% y)
# result 2
#    x       y
# test service
# test   audio
# test   video

Can I get an explanation of the behavior above? I want to keep only the rows where column 'y' is 'service'. I do not understand why the rows with 'audio' and 'video' are returned too.

EDIT: I do not understand why this question is being flagged. I am aware of the difference between '==' and '%in%', and I am not asking about that difference in general. I am asking why my code does not give the wanted output when I use %in% inside dplyr's filter. I am not using %in% at random and then asking why it behaves that way. Please read the whole question rather than only the title.

EDIT2: As suggested, I am changing the title to indicate that my question is different from the existing question with a similar title.

addicted
  • `"service" %in% df$y` just gives you a single value, `TRUE`. There's nothing about that expression that makes it go through the rows one by one, it just says "is the value `'service'` in the vector `y`?". – Marius Jun 13 '17 at 05:32
  • Perhaps edit the title. As we learned, the issue was incorrect order within the filter argument, which is a different issue to the suggested duplicate. – neilfws Jun 13 '17 at 12:42
  • @neilfws Thanks for the suggestion. I am trying it. – addicted Jun 14 '17 at 02:21
  • I have edited my question. Will people take back their downvotes? – addicted Jun 23 '17 at 02:40

1 Answer


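Basically, your %in% is the wrong way around: the column goes on the left and the value(s) to match go on the right. But there's not much point in using %in% unless you are matching against a character vector with more than one value; for a single value, y == 'service' is enough.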

library(dplyr)

df %>%
  filter(y %in% "service")
# %in% c("service", "...", "...") would be more usual
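
To see why the order matters, it may help to look at the logical values each form produces. This is a minimal sketch in base R only (no dplyr needed to run it); `wrong`, `right` and `wanted` are names made up for the illustration. As far as I know, filter() treats a single TRUE the same way base subsetting does, recycling it across every row.

```r
df <- data.frame(x = rep("test", 3), y = c("service", "audio", "video"))

# One value on the left of %in%: one answer for the whole column.
wrong <- "service" %in% df$y   # a single TRUE

# The column on the left of %in%: one answer per row.
right <- df$y %in% "service"   # TRUE FALSE FALSE

# A single TRUE is recycled, so every row is kept.
nrow(df[wrong, ])   # 3
nrow(df[right, ])   # 1

# Hypothetical lookup vector, for the multi-value case.
wanted <- c("service", "video")
df[df$y %in% wanted, ]
```

The same per-row condition can be passed straight to filter(), e.g. filter(df, y %in% wanted).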
neilfws
  • Thanks man. I feel stupid for mixing that up. Initially there will be just one value, but I want to create a vector variable so I can do this: (y %in% char_vector) – addicted Jun 13 '17 at 05:47