I'm trying to use dplyr's full_join to combine two data.frames, for example:

col1 = 'b'
col2 = 'd'

df1 = data.frame(a = 1:3, b = 1:3)
df2 = data.frame(a = 1:3, d = 1:3)


full_join(df1, df2, c('a' = 'a', col1 = col2))

but it returns

Error: by can't contain join column col1 which is missing from LHS

I'm looking for an output similar to

merge(df1, df2, by.x = c('a', col1), by.y = c('a', col2))
  a b
1 1 1
2 2 2
3 3 3
Rafael
  • Possible duplicate of [Dplyr join on by=(a = b), where a and b are variables containing strings?](https://stackoverflow.com/questions/28399065/dplyr-join-on-by-a-b-where-a-and-b-are-variables-containing-strings) – Rafael Mar 05 '18 at 13:11
  • What's wrong with `merge()`? – jay.sf Mar 05 '18 at 14:01
  • I think it changes the order, it ruins `geom_polygon` plots – Rafael Mar 05 '18 at 15:45
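
The ordering concern in the last comment can be checked on a small example. This is a minimal sketch with made-up data frames (`x`, `y` and their columns are hypothetical, not from the question): `merge()` sorts the result on the key columns by default (`sort = TRUE`), while `full_join()` keeps the rows in the order they appear in its first argument.

```r
library(dplyr)

# Keys are deliberately out of order so that any re-sorting is visible.
x <- data.frame(id = c(3, 1, 2), x_val = c("c", "a", "b"))
y <- data.frame(id = c(1, 2, 3), y_val = c(10, 20, 30))

merged <- merge(x, y, by = "id")      # sort = TRUE is the default
joined <- full_join(x, y, by = "id")  # rows follow the order of x

merged$id  # 1 2 3
joined$id  # 3 1 2
```

For data where row order carries meaning, such as polygon vertices, this difference is what matters when choosing between the two.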

4 Answers

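You can rename the column in df2 to the name stored in col1 and then join on the shared names, e.g. with rename_ (deprecated since dplyr 0.7.0):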

library(dplyr)

full_join(df1, rename_(df2, .dots = setNames(col2, col1)))

which gives,

#Joining, by = c("a", "b")
  a b
1 1 1
2 2 2
3 3 3

Posting alternatives as per @akrun and @mt1022 comments,

#akrun
full_join(df1, rename_at(df2, .vars = col2, funs(paste0(col1))))
full_join(df1, rename(df2, !!(col1) := !!rlang::sym(col2)))

#mt1022
full_join(df1, rename_at(df2, col2, ~col1))
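
The underscore verbs (`rename_`) and `funs()` have since been deprecated in dplyr, and `rename_at()` is superseded. Below is a minimal sketch of the same rename-then-join idea with current verbs, assuming a recent dplyr (1.0.0 or later for `rename_with()`). The names `res1` and `res2` are only for illustration.

```r
library(dplyr)

col1 <- "b"
col2 <- "d"
df1 <- data.frame(a = 1:3, b = 1:3)
df2 <- data.frame(a = 1:3, d = 1:3)

# rename() accepts a named character vector (new name = old name) via all_of().
res1 <- full_join(df1, rename(df2, all_of(setNames(col2, col1))),
                  by = c("a", col1))

# rename_with() applies a function to the selected column names;
# here the function ignores its input and returns the new name.
res2 <- full_join(df1, rename_with(df2, ~ col1, all_of(col2)),
                  by = c("a", col1))
```

Passing `by` explicitly avoids the "Joining, by = ..." message and makes the join keys visible in the call.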
Sotos

Change the join like this:

full_join(df1, df2, by = c('b' = 'd'))
  a.x b a.y
1   1 1   1
2   2 2   2
3   3 3   3
Terru_theTerror

All credit to @MrFlick in the duplicate link; slightly modified for OP's example:

full_join(df1, df2, by = c("a", setNames(col2, col1)))
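
Why this works while the call in the question fails can be seen from the names of the two vectors. A minimal base R sketch (the variable names `literal` and `by_vec` are only for illustration): a name written inside `c()` is taken literally, whereas `setNames()` evaluates its second argument.

```r
col1 <- "b"
col2 <- "d"

# Inside c(), `col1 =` is a literal name, so the element is named "col1".
literal <- c("a" = "a", col1 = col2)
names(literal)  # "a" "col1"

# setNames() evaluates col1, so the element is named "b".
by_vec <- c("a", setNames(col2, col1))
names(by_vec)   # ""  "b"
```

In a dplyr `by` vector the names refer to columns of the left table and the values to columns of the right table. An unnamed element such as `"a"` is, as far as I know, matched under the same name on both sides, so `by_vec` is equivalent to `c("a" = "a", "b" = "d")`.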
moodymudskipper

This reproduces your result, but is it what you are looking for?

full_join(df1, df2, by = "a") %>% select(-d)
Antonios