I am trying to select rows based on their ID. For example, in a data frame called test, ID 201 has 6 rows of data, ID 202 also has 6 rows, and so on for 203, 204, etc.

Now I want to extract only IDs 201 and 202 from the dataset, so the result should have 12 rows altogether. However,

out <- test[test$ID==c(201,202), ]
out <- subset(test, ID==c(201,202))

only returns three rows for 201 and three rows for 202, namely rows 1, 3, 5, 8, 10, and 12.

Can anyone suggest how I can do this in R?

joran
Fred
    In case you're wondering *why* you got what you did, `==` compares element-wise and recycles one vector if it runs out. So it just alternated checking the ID column with 201 and 202. The `%in%` answer is best, but you also could have used `subset(test, ID == 201 | ID == 202)` – Gregor Thomas Dec 06 '11 at 06:38
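To make the recycling described in the comment above concrete, here is a minimal sketch. The vector `id` is a hypothetical stand-in for `test$ID`, assuming six rows per ID as in the question; `pattern` is just an illustrative name for the recycled right-hand side.

```r
# Hypothetical stand-in for test$ID: six rows each for IDs 201, 202, 203
id <- rep(c(201, 202, 203), each = 6)

# What `==` effectively compares against: c(201, 202) repeated to the
# length of id, i.e. 201, 202, 201, 202, ...
pattern <- rep(c(201, 202), length.out = length(id))

# The recycled comparison and the explicit one give the same result
identical(id == c(201, 202), id == pattern)

# Positions that happen to line up with the alternating pattern
which(id == c(201, 202))
# [1]  1  3  5  8 10 12
```

Only the odd positions are checked against 201 and only the even positions against 202, which matches the six rows reported in the question.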

1 Answer

You want `%in%`, not `==`.

out <- test[test$ID %in% c(201, 202), ]
out <- subset(test, ID %in% c(201, 202))
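
As a quick check, the following sketch runs both forms on made-up data. The data frame here is an assumption built to resemble the one in the question (IDs 201 to 204, six rows each); the `value` column and the `out_*` names are invented for illustration.

```r
# Hypothetical data resembling the question: IDs 201 to 204, six rows each
test <- data.frame(ID = rep(201:204, each = 6), value = seq_len(24))

# %in% tests every ID for membership in the set, with no recycling
out_index  <- test[test$ID %in% c(201, 202), ]
out_subset <- subset(test, ID %in% c(201, 202))

# The explicit "or" form is equivalent for a short list of IDs
out_or <- subset(test, ID == 201 | ID == 202)

nrow(out_index)        # 12
nrow(out_subset)       # 12
nrow(out_or)           # 12
unique(out_index$ID)   # 201 202
```

`%in%` scales better than chaining `|` once the list of IDs grows, since the set can be kept in a single vector.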
Hong Ooi
    @MattDowle this is a great candidate for the data.table-intro vignette or perhaps the FAQ! I definitely stepped on this land mine last weekend. – bright-star Jan 26 '14 at 02:59