I am trying to create a logical column in a data.table that flags, row by row, whether one character column matches several other character columns. NAs in the compared columns should be ignored.
Sample data for D2:
D2 <- data.table::setDT(structure(list(ID = c("a001", "a002", "a003"),
var1 = c("char1", "char1", "char2"), var2 = c("char1", NA, "char2"),
var3 = c("char1", "char1", "char1")), row.names = c(NA, -3L),
class = c("data.table", "data.frame")))
I tried the following proposed solution:
D2[, Match := apply(sapply(.SD, `==`, D2[, "var1"]), 1, any), .SDcols =
c("var2", "var3")]
The result for a003 is TRUE, whereas it should be FALSE because var1 and var3 don't match:
structure(list(ID = c("a001", "a002", "a003"), var1 = c("char1",
"char1", "char2"), var2 = c("char1", NA, "char2"), var3 = c("char1",
"char1", "char1"), Match = c(TRUE, TRUE, TRUE)), row.names = c(NA,
-3L), class = c("data.table", "data.frame"), .internal.selfref = <pointer:
0x0000015eb1261ef0>)
Desired Result:
structure(list(ID = c("a001", "a002", "a003"), var1 = c("char1",
"char1", "char2"), var2 = c("char1", NA, "char2"), var3 = c("char1",
"char1", "char1"), Match = c(TRUE, TRUE, FALSE)), row.names = c(NA,
-3L), class = c("data.table", "data.frame"), .internal.selfref = <pointer:
0x0000015eb1261ef0>)
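For reference, here is a minimal sketch of the logic I am after, assuming the rule is "every non-NA value in var2 and var3 must equal var1" and that var1 itself is never NA (both are my assumptions, not something a library guarantees). The attempt above uses `any`, which returns TRUE as soon as a single column matches, so a003 passes on var2 alone. The sketch rebuilds the sample data with `data.table()` rather than the `dput` output, and combines the per-column comparisons with `Reduce`:

```r
library(data.table)

# Sample data from the question, rebuilt without the .internal.selfref pointer.
D2 <- data.table(
  ID   = c("a001", "a002", "a003"),
  var1 = c("char1", "char1", "char2"),
  var2 = c("char1", NA, "char2"),
  var3 = c("char1", "char1", "char1")
)

# For each column in .SDcols, a cell passes if it is NA (ignored) or equals var1.
# Reduce(`&`, ...) then requires every compared column in the row to pass.
# Assumption: var1 has no NAs; if it did, the result for that row would be NA.
D2[, Match := Reduce(`&`, lapply(.SD, function(x) is.na(x) | x == var1)),
   .SDcols = c("var2", "var3")]

print(D2)
```

Adding more columns to compare should only require extending `.SDcols`.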