3

I have a data frame left with columns name and x, and I want to match the missing x entries with the corresponding name entries from a data frame right:

left = data.frame(
  name = c("Bob", "Billy", "Roger", "Anna", "Kathy"),
  x = c(NA, NA, NA, 4, 5)
)

# > left
#   name  x
# 1 Bob   NA
# 2 Billy NA
# 3 Roger NA
# 4 Anna   4
# 5 Kathy  5

right = data.frame(
  name = c("Bob", "Billy", "Roger"),
  x = c(1, 2, 3)
)

# > right
#   name  x
# 1 Bob   1
# 2 Billy 2
# 3 Roger 3

A left_join gives me two columns

> left_join(left, right, by="name")
   name x.x x.y
1 Bob    NA   1
2 Billy  NA   2
3 Roger  NA   3
4 Anna    4  NA
5 Kathy   5  NA

But what I want is:

    name x 
1 Bob     1   
2 Billy   2   
3 Roger   3   
4 Anna    4  
5 Kathy   5  

Can I do this with a join command?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
luffe
  • 1,588
  • 3
  • 21
  • 32

3 Answers3

4
library(data.table)

setDT(left)[right, on = 'name', x := i.x]
left
#    name x
#1:   Bob 1
#2: Billy 2
#3: Roger 3
#4:  Anna 4
#5: Kathy 5
eddi
  • 49,088
  • 6
  • 104
  • 155
3

Another one

library(dplyr)
left_join(left, right, "name") %>% 
  transmute(name, x=coalesce(x.x, x.y))
#    name x
# 1   Bob 1
# 2 Billy 2
# 3 Roger 3
# 4  Anna 4
# 5 Kathy 5

(via)

Community
  • 1
  • 1
lukeA
  • 53,097
  • 5
  • 97
  • 100
2

Here's one possible solution

> out <- na.omit(merge(left, right, by=c("name", "x"), all=TRUE))
> out[order(out$x),] # to get your same order
   name x
4   Bob 1
2 Billy 2
7 Roger 3
1  Anna 4
6 Kathy 5
Jilber Urbina
  • 58,147
  • 10
  • 114
  • 138
  • 1
    (A minor thing:) This depends on `right` being able to fill all the missing values in `left`. – Frank Sep 06 '16 at 21:45