
I am trying to merge two data frames of different lengths without using a unique key.

For example:

Name <- c("Steve","Peter")
Age <- c(10,20)

df1 <- data.frame(Name,Age)

> df1
   Name Age
1 Steve  10
2 Peter  20

Name <- c("Jason","Nelson")
School <- c("xyz","abc")

df2 <- data.frame(Name,School)

> df2
    Name School
1  Jason    xyz
2 Nelson    abc

I want to join these two tables so that I have all columns and have NA cells for rows that didn't have that column originally. It should look something like this:

    Name Age School
1  Steve  10   <NA>
2  Peter  20   <NA>
3  Jason  NA    xyz
4 Nelson  NA    abc

Thank you in advance!

    `dplyr::bind_rows(df1,df2)`. (Most times `merge` and `join` deal with aligning columns, not with combining rows as here. I'd label this question's action more "append" than "merge". I'm not saying you're wrong, but knowing common nomenclature might inform your own research.) – r2evans May 07 '20 at 16:31
    @r2evans Thanks for the help. I will look into the tags for more useful research next time. I'm still new to the Stack Overflow platform. Thanks anyway! – Steviee kann May 07 '20 at 16:35

1 Answer

dplyr::bind_rows(df1,df2)
# Warning in bind_rows_(x, .id) :
#   Unequal factor levels: coercing to character
# Warning in bind_rows_(x, .id) :
#   binding character and factor vector, coercing into character vector
# Warning in bind_rows_(x, .id) :
#   binding character and factor vector, coercing into character vector
#     Name Age School
# 1  Steve  10   <NA>
# 2  Peter  20   <NA>
# 3  Jason  NA    xyz
# 4 Nelson  NA    abc
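The warnings above come from `Name` and `School` being factors, which is what `data.frame()` produced by default before R 4.0.0. As a minimal sketch, assuming the `dplyr` package is installed and that plain character columns are acceptable, building the frames with `stringsAsFactors = FALSE` avoids the factor-to-character coercion:

```r
library(dplyr)

# Same data as in the question, but with strings kept as character vectors
# (this is already the default from R 4.0.0 onward).
df1 <- data.frame(Name = c("Steve", "Peter"), Age = c(10, 20),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Name = c("Jason", "Nelson"), School = c("xyz", "abc"),
                  stringsAsFactors = FALSE)

# bind_rows() matches columns by name and fills the missing ones with NA.
combined <- bind_rows(df1, df2)
combined
```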

You can avoid some of these warnings by first adding the columns missing from each frame, filled with NA; this also works well with base R:

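# indexing rows with NA returns all-NA rows, so each missing column is added filled with NA
df2 <- cbind(df2, df1[NA, setdiff(names(df1), names(df2)), drop = FALSE])
df1 <- cbind(df1, df2[NA, setdiff(names(df2), names(df1)), drop = FALSE])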
df1
#       Name Age School
# NA   Steve  10   <NA>
# NA.1 Peter  20   <NA>
df2
#        Name School Age
# NA    Jason    xyz  NA
# NA.1 Nelson    abc  NA

# ensure we use the same column order for both frames
nms <- names(df1)
rbind(df1[,nms], df2[,nms])
#         Name Age School
# NA     Steve  10   <NA>
# NA.1   Peter  20   <NA>
# NA1    Jason  NA    xyz
# NA.11 Nelson  NA    abc
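The same idea can be wrapped in a small function that also resets the `NA`-style row names seen above. This is only a sketch: `bind_rows_base` is a hypothetical helper name, not part of base R or dplyr, and it assumes character (not factor) columns.

```r
# Hypothetical helper: append the rows of `b` to `a`, filling columns that
# are missing from either frame with NA.
bind_rows_base <- function(a, b) {
  # Add each column that one frame lacks, filled with NA.
  for (nm in setdiff(names(b), names(a))) a[[nm]] <- rep(NA, nrow(a))
  for (nm in setdiff(names(a), names(b))) b[[nm]] <- rep(NA, nrow(b))
  # Use the same column order for both frames before stacking them.
  out <- rbind(a, b[, names(a), drop = FALSE])
  rownames(out) <- NULL  # reset row names to 1..n
  out
}

df1 <- data.frame(Name = c("Steve", "Peter"), Age = c(10, 20),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Name = c("Jason", "Nelson"), School = c("xyz", "abc"),
                  stringsAsFactors = FALSE)

combined <- bind_rows_base(df1, df2)
combined
```

Assigning new columns with `[[<-` instead of `cbind()` leaves the original row names alone, and `rownames(out) <- NULL` renumbers the combined rows.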
r2evans