I have a look-up table of "firsts" in column d. For example, the first time the patient was admitted because of a specific disease. I would like to join this back into the main data frame via data.table on multiple other conditions.
My problem is that, unfortunately, the main data.table could have multiple records with identical joining criteria that results in multiple "firsts" per patient after the join. Real world data is messy, people!
Is it possible to do a {data.table} join on only the first matching record?
This is similar to this question, but the multiple-matches are on the main data table. I think that mult
only works on when there are several entries on the table being joined in.
repex:
library(data.table)
set.seed(1724)
d1 <- data.table(a = c(1, 1, 1),
b = c(1, 1, 2),
c = sample(1:10, 3))
d2 <- data.table(a = 1, b = 1, d = TRUE)
d2[d1, on = c("a", "b")]
a b d c
1: 1 1 TRUE 4
2: 1 1 TRUE 8
3: 1 2 NA 2
desired output
a b d c
1: 1 1 TRUE 4
2: 1 1 NA 8
3: 1 2 NA 2