I have two df's that look like this,
id1<-c(2,2,2,1,3)
id2<-c(4,4,5,2,3)
OS<-c(1,2,3,4,5)
a <- cbind(OS, id1, id2)
a
i1<-c(2,2,2,4,3)
i2<-c(4,4,5,2,3)
OSi<-c(1,2,3,4,5)
b <- cbind(OSi, i1, i2)
I want to create a new col (x) with a '0' where a$OS & a$OSi match (i.e. 1->1, 2->2) but only where the subsequent two cols in each df are in agreement, and '1' where they don't. My desired output is something like this:
id1<-c(2,2,2,1,3)
id2<-c(4,4,5,2,3)
OS<-c(1,2,3,4,5)
x <-c(0,0,0,1,0)
a <- cbind(OS, id1, id2, x)
a
I have so far tried this:
a$x <- case_when(a$OS %in% b$OSi &
a$id1 %in% b$I1 &
b$id2 %in% b$I2 ~ '0') %>%
replace_na('1')
However, I find that Case_when only looks to satisfy the first argument here. (i.e. the ratio of 0/1 change depending on the order of the variables).
Also, the above is a top df. My actual data is ~10000 rows and b is longer than a. I may also want to drop any rows not in a, but in b.
I hope that's clear enough. Dplyr my preference to integrate into pipes!
Cheers!