0

i think its an regular problem, answered several times, but I just don't know how to ask the question right =(

in MySQL:
i have 2 tables with some kind of strings inside, now i want:
1. the data that occur in both tables
2. the data from a that is not in table b

same in R:
i have 2 R data.frame s and i want:
1. the data that occur in a and b
2. the data that occur in a but not in b

TekTimmy
  • 3,066
  • 2
  • 29
  • 33
  • R: So your two data frames are 'a' and 'b'? And you want to find the instances where the whole row of the data frame (which could be several values) are the same between 'a' and 'b'? – Spacedman Sep 29 '10 at 13:28
  • Really, you could perform both of these operations in R using RMySQL – Brandon Bertelsen Sep 29 '10 at 16:00
  • joshua and ErVeY got me right, they gave the solution! both in R?? why same thing twice ? just wanted to know how it works in both =) – TekTimmy Oct 01 '10 at 11:19

2 Answers2

3

in mysql you could do this to get the data in both tables

SELECT * FROM table1 t1 INNER JOIN table2 t2 ON t1.id = t2.id

and do this to get the tada from a that is not in table b

SELECT * FROM table1 t1 LEFT JOIN table2 t2 ON t1.id = t2.id
WHERE t2.id IS NULL
ErVeY
  • 1,524
  • 4
  • 16
  • 26
3

In R:

a <- data.frame(V1=sample(letters[1:3],20,TRUE),V2=rnorm(20))
b <- data.frame(V1=sample(letters[2:4],20,TRUE),V2=rnorm(20))

# the data that occur in a and b
(ab <- merge(a,b,by="V1"))

# the data that occur in a but not in b 
aNOTb <- merge(a,b,by="V1",all=TRUE)
(aNOTb <- aNOTb[is.na(aNOTb$V2.y),])
Joshua Ulrich
  • 173,410
  • 32
  • 338
  • 418