Good evening, I am trying to merge a couple datasets and my normal tools in R are failing me tonight. Consider df1 and df2 below.
df1 = data.frame(a = c("a", "b", "c"),
b = c("1", "2", "3"),
c = c("x", "y", "z"))
df2 = data.frame(a = c("1", "b", "c", "d", "e"),
b = c("a", "2", "3", "4", "5"),
d = c("x2", "y2", "z2", "x3", "y3"))
In both cases, column a and b are supposed to act as grouping variables. For example, in df1, when a = a and b = 1, then c = x. Given the structure of the data I am working with, the actual order of a and b does not matter such that if a were to = 1 and b = a, then c will still equal x. Herein lies the problem, I would like to merge df1 with a new df, df2. df2, is similarly structured, but contains a new variable d. And, as can be seen df2 includes some a and b combinations that are backwards compared to A. In addition, B has some additional observations.
The desired dataframe I am looking for looks like this:
desired = data.frame(a = c("a", "b", "c"),
b = c("1", "2", "3"),
c = c("x", "y", "z"),
d = c("x2", "y2", "z2"))
As can be seen the original column structure from a b and c are preserved, and we have added in column D. However, we have not added any new observations.
I have tried using merge()
with varying combinations of by.x
, by.y
.
I also tried using various left_join
and inner_join
but I keep on getting whaping data sets that still aren't handling the mismatch in the a/b columns.
Thanks for any thoughts or help you might be able to provide.
Cheers