I have two data.frame
s which have a 3 columns:
1. id
- a unique key
target
- semicolon separated unique valuessource
- similar for each of the data frames but different for the twodata.frame
s.
Here's simulated data:
set.seed(1)
df.1 <- data.frame(id=LETTERS[sample(length(LETTERS),10,replace=F)],
target=sapply(1:10,function(x) paste(LETTERS[sample(length(LETTERS),5,replace=F)],collapse=";")),
source="A",stringsAsFactors=F)
df.2 <- data.frame(id=LETTERS[sample(length(LETTERS),5,replace=F)],
target=sapply(1:5,function(x) paste(LETTERS[sample(length(LETTERS),5,replace=F)],collapse=";")),
source="B",stringsAsFactors=F)
I'm looking for a function that will collapse the two data.frame
s together and will create 3 columns:
1.intersected.targets
- semicolon separated unique values that are intersected between the two data.frame
s
2.source1.targets
- targets that are unique to the first data.frame
3.source2.targets
- targets that are unique to the second data.frame
So for the example above the resulting data.frame
will be:
> res.df
id intersected.targets sourceA.targets sourceB.targets
1 G NA F;E;Q;I;X <NA>
2 J NA M;R;X;I;Y <NA>
3 N NA Y;F;P;C;Z <NA>
4 U NA K;A;J;U;H <NA>
5 E NA M;O;L;E;S <NA>
6 S NA R;T;C;Q;J <NA>
7 W NA V;Q;S;M;L <NA>
8 M NA U;A;L;Q;P <NA>
9 B NA C;H;M;P;I <NA>
10 X NA <NA> G;L;S;B;T
11 H NA <NA> I;U;Z;H;K
12 Y NA <NA> L;R;J;H;Q
13 O NA <NA> F;R;C;Z;D
14 L V M;K;F;B X;J;R;Y