0

Assume that given data.table da, there is another data.table, db which has some column names equal to the column names of da. Some of those columns with equal names also have identical content. My question is how to replace those entries in da that have columns matching with db with content of db?

Much simpler put:

require(data.table)
da <- data.table(a=1:10,b=10:1,c=LETTERS[1:10])

## All (a,b) pairs in db exist in da, but c is different.
db <- data.table(a=c(2,6,8),b=c(9,5,3),c=c('x','y','z')) 

## Data table dx will have c-column values of  db$c where (a,b) matches between da and db.
dx <- db[da,.(a,b,c=fifelse(is.na(c),i.c,c),on=c('a','b')]

Output

> dx
     a  b c
 1:  1 10 A
 2:  2  9 x
 3:  3  8 C
 4:  4  7 D
 5:  5  6 E
 6:  6  5 y
 7:  7  4 G
 8:  8  3 z
 9:  9  2 I
10: 10  1 J
> da
     a  b c
 1:  1 10 A
 2:  2  9 B
 3:  3  8 C
 4:  4  7 D
 5:  5  6 E
 6:  6  5 F
 7:  7  4 G
 8:  8  3 H
 9:  9  2 I
10: 10  1 J
> db
   a b c
1: 2 9 x
2: 6 5 y
3: 8 3 z
> dx
     a  b c
 1:  1 10 A
 2:  2  9 x
 3:  3  8 C
 4:  4  7 D
 5:  5  6 E
 6:  6  5 y
 7:  7  4 G
 8:  8  3 z
 9:  9  2 I
10: 10  1 J
> 

I know that the above achieves my goal, but it feels clumsy. Is there a built-in data.table way to do this?

Mali Remorker
  • 1,206
  • 11
  • 20

1 Answers1

0

Okay, I figured it

dx <- da[db,c:=i.c,on=c('a','b')]
Mali Remorker
  • 1,206
  • 11
  • 20
  • As a sidenote, `data.table` has sometimes confusing notation. My first solution -- which is a join -- is of the form `db[da,...,on=...]`. My second solution, which is a sort of a subset and replace is `da[db,...,on=...]`. – Mali Remorker Mar 29 '21 at 15:31