I've done quite a bit of reading on this but I've not been able to get an answer that works yet.

I've been using the setdiff function in R to compare a variable across two data frames. I know that 71 of the 200 observations match and the remainder do not.

So far I've just done this to get the number of matching and non-matching values:

check = setdiff(dataset1$variable1, dataset2$variable1)

How do I return a list of the matching and non-matching values?

Thanks,

Ed

Thirst for Knowledge

1 Answer

All the matching values are found with the intersect function, one of R's set operations. All the values that occur in either variable are found with the union function. So the values that are in the union but not in the intersection are the non-matching ones.

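var1 <- LETTERS[1:5]                # "A" to "E"
var2 <- LETTERS[4:8]                # "D" to "H"
matched <- intersect(var1, var2)    # values present in both
all.values <- union(var1, var2)     # values present in either (named to avoid confusion with base::all)
non.matched <- all.values[!all.values %in% matched]   # values present in only one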
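
Applied to the data frames from the question, the same idea might look like the sketch below. The contents of dataset1 and dataset2 are made up for illustration; only the column name variable1 comes from the question. Note that setdiff is not symmetric: setdiff(x, y) returns only the values of x that are missing from y, so it is called once in each direction here.

```r
# Hypothetical stand-ins for the asker's data frames (values are invented)
dataset1 <- data.frame(variable1 = c("a", "b", "c", "d"), stringsAsFactors = FALSE)
dataset2 <- data.frame(variable1 = c("c", "d", "e"), stringsAsFactors = FALSE)

# Values that occur in both columns
matched <- intersect(dataset1$variable1, dataset2$variable1)

# Values that occur in only one of the two columns
only.in.1 <- setdiff(dataset1$variable1, dataset2$variable1)
only.in.2 <- setdiff(dataset2$variable1, dataset1$variable1)

# Collect everything in one named list
result <- list(matched = matched, only.in.1 = only.in.1, only.in.2 = only.in.2)
print(result)
```

Splitting the non-matching values by source like this also shows which data frame each one came from, which the union-based approach does not.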
Edwin
  • Thanks Edwin, I've managed to do it now using these two functions. One other question: in your code, what do 'LETTERS[1:5]' and 'LETTERS[4:8]' mean? – Thirst for Knowledge Feb 17 '14 at 11:21
  • The letters of the alphabet are built into R. Entering LETTERS returns all 26 capital letters. The square brackets select a subset by position. So var1 is A up to and including E, and var2 is D up to and including H. – Edwin Feb 17 '14 at 11:38
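
For anyone who wants to check the subsetting described in the comments, a minimal sketch that can be pasted into an R console (nothing here is specific to the asker's data):

```r
# LETTERS is a built-in character vector of the 26 capital letters
n <- length(LETTERS)

# Square brackets pick elements by position; 1:5 means positions 1 through 5
var1 <- LETTERS[1:5]
var2 <- LETTERS[4:8]

print(n)
print(var1)
print(var2)
```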