Suppose I have the following data.

x <- c(1, 2, 3, 4, 5, 1, 3, 8, 2)
y <- c(4, 2, 5, 6, 7, 6, 7, 8, 9)
data <- cbind(x, y)

    x   y
1   1   4
2   2   2
3   3   5
4   4   6
5   5   7
6   1   6
7   3   7
8   8   8
9   2   9  

Now, if I want to subset this data to select only the observations with "x" between 1 and 3, I can do:

s1<- subset(data, x>=1 & x<=3)

and obtain my desired output:

    x   y
1   1   4
2   2   2
3   3   5
4   1   6
5   3   7
6   2   9

However, if I subset using the colon operator, I obtain a different result:

s2<- subset(data, x==1:3)

    x   y
1   1   4
2   2   2
3   3   5

This time it only includes the first observation in which "x" was 1, 2, or 3. Why? I would like to use the ":" operator because I am writing a function in which the user inputs a range of "x" values and gets back the average of the "y" variable over the matching observations. I would prefer that they can use the ":" operator to pass this argument to the subset function inside my function, but I don't know why subsetting with ":" gives me different results.

I'd appreciate any suggestions in this regard.

MrFlick
dll

1 Answer

You can use `%in%` instead of `==`:

 subset(data, x %in% 1:3)

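In general, if we are comparing two vectors of unequal length, `%in%` should be used. `==` compares element by element and recycles the shorter vector, so `x == 1:3` only matches where `x` happens to line up with the repeated pattern 1, 2, 3, 1, 2, 3, ... There are cases where we can take advantage of recycling (it can fail too), for example if the length of one of the vectors is double that of the second. Some examples with descriptions are here.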
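To make the difference concrete, here is a minimal sketch using the data from the question. It assumes the data is stored as a data frame rather than the `cbind` matrix, and the helper `mean_y_for` is a hypothetical name, only meant to illustrate how a user-supplied range such as `1:3` could be passed through; it is not the asker's actual function.

```r
x <- c(1, 2, 3, 4, 5, 1, 3, 8, 2)
y <- c(4, 2, 5, 6, 7, 6, 7, 8, 9)
data <- data.frame(x, y)  # assumption: a data frame instead of cbind(x, y)

# `==` compares position by position. 1:3 is recycled to
# 1, 2, 3, 1, 2, 3, 1, 2, 3 to match the length of x, so only rows
# where x lines up with that pattern are TRUE.
recycled <- rep(1:3, length.out = length(x))
eq_mask <- x == 1:3
identical(eq_mask, x == recycled)  # TRUE

# `%in%` asks, for each element of x, whether it occurs anywhere in 1:3.
in_mask <- x %in% 1:3

which(eq_mask)  # rows 1, 2, 3
which(in_mask)  # rows 1, 2, 3, 6, 7, 9

# Hypothetical helper: mean of y over the rows whose x is in `values`.
mean_y_for <- function(df, values) {
  mean(df$y[df$x %in% values])
}

mean_y_for(data, 1:3)  # 5.5
```

There is no warning in the question's example because the length of `x` (9) is a multiple of the length of `1:3`. When it is not, as in `1:4 == 1:3`, R still recycles but warns that the longer object length is not a multiple of the shorter one.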

Community
akrun
  • @davidll Thanks for the feedback. When you are comparing two vectors of unequal sizes, or a vector with another one of length >1, `==` will not work – akrun Jun 12 '15 at 16:47
  • @akrun `1:4 == 1:2` "works" in a sense. I guess the best way to be precise about it is to reference some documentation (though I don't know which) – Frank Jun 12 '15 at 18:19
  • @Frank Yes, by recycling, but `1:4 == 1:3` will give a warning – akrun Jun 12 '15 at 18:21
  • @Frank Another case where the recycling may fail is `c(1:2,2:1)==1:2`. It again depends on what the expected result would be. – akrun Jun 12 '15 at 18:29
  • Okay, yeah, I just meant that some explanation would be good (and now see that you've added it). Maybe that other question is a dupe...? Anyway, I'll leave that to you. – Frank Jun 12 '15 at 18:33
  • @Frank It could be closed as a dupe but the title is based on a warning message. If we can find a better dupe, it is good. – akrun Jun 12 '15 at 18:34