I have a tibble with information about diagnoses:
data <- tibble(
id = c(1:10),
diagnosis_1 = c("F32", "F431", "R58", "S32", "F11", NA, NA, "Y67", "F32", "Z032"),
diagnosis_2 = c(NA, NA, NA, NA, NA, NA, "G35", NA, NA, NA),
diagnosis_3 = c("F40", NA, "R67", "F431", NA, "F60", "S58", "R68", "F11", NA),
diagnosis_4 = c(NA, NA, "F65", NA, "F19", NA, NA, "F32", NA, NA)
)
As a part of the cleaning process, I have removed all diagnoses not fulfilling certain criteria (i.e. not starting with the letter F, G, or Z). With the following code:
data$diagnosis_1[str_sub(data$diagnosis_1, 1,1) %in% c("R", "S", "Y")] <- NA
data$diagnosis_2[str_sub(data$diagnosis_2, 1,1) %in% c("R", "S", "Y")] <- NA
data$diagnosis_3[str_sub(data$diagnosis_3, 1,1) %in% c("R", "S", "Y")] <- NA
data$diagnosis_4[str_sub(data$diagnosis_4, 1,1) %in% c("R", "S", "Y")] <- NA
Ending up with this tibble:
I now need to move the data to the left to fill the columns from left to right (i.e diagnosis_1 not being empty if diagnosis_2, diagnosis_3 or diagnosis_4 has data). I have tried using ifelse() as it is vectorized but I can`t seem to get it to work with several nested ifelse().
ifelse(is.na(data$diagnosis_1), data$diagnosis_2, data$diagnosis_1))
All suggestions are much appreciated.
Edit: adding expected output: