I'm trying to write a custom function that adds new binary variables to an existing dataframe. The idea is to pass the function a diagnosis description (string), an ICD9 diagnosis code (number), and the patient database. The function would then create a new variable for each diagnosis of interest and assign 1 if the patient (row or observation) has the diagnosis and 0 otherwise.
Below is a sample patient dataframe:
library(tibble) #tibble() is the current name for the deprecated data_frame()
patient<-c("a","b","c")
diag_1<-c("8661","2851","8651")
diag_2<-c("8651","8674","2866")
diag_3<-c("2430","3456","9089")
patient_db<-tibble(patient,diag_1,diag_2,diag_3)
  patient diag_1 diag_2 diag_3
1 a       8661   8651   2430
2 b       2851   8674   3456
3 c       8651   2866   9089
Below are the function arguments:
x<-c("2851") #ICD9 code for anemia
y<-c("diag_1") #Column to search (primary diagnosis)
z<-"Anemia" #Name of the new binary variable in the patient dataframe
i<-patient_db #Patient dataframe
Below is the function:
diagnosis_func<-function(x,y,z,i){
  pattern = paste("^(", paste0(x, collapse = "|"), ")", sep = "")
  i$z<-ifelse(rowSums(sapply(i[y], grepl, pattern = pattern)) != 0,"1","0")
}
This is the result I would like to get after running the function:
  patient diag_1 diag_2 diag_3 Anemia
1 a       8661   8651   2430   0
2 b       2851   8674   3456   1
3 c       8651   2866   9089   0
The lines inside the function work when I run them on their own, outside the function. Where I'm stuck is getting them to work as a function. Any help would be greatly appreciated.
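For reference, here is a minimal sketch of one way the function might be restructured; it is not a confirmed solution. It rests on two observations about R: `i$z` creates a column literally named "z" instead of using the value stored in z, and a function works on a local copy of its arguments, so the modified dataframe has to be returned and reassigned. The name add_diagnosis_flag and its argument names are hypothetical, the flag is stored as an integer 0/1 instead of a string, and a base data.frame is assumed in place of data_frame so the sketch runs without extra packages.

```r
# Sketch only: add_diagnosis_flag is a hypothetical name, not part of any package.
add_diagnosis_flag <- function(codes, cols, new_name, df) {
  # Anchored at the start, so "2851" also matches longer codes such as "28510".
  pattern <- paste0("^(", paste0(codes, collapse = "|"), ")")
  # One logical vector per diagnosis column, combined element-wise with OR.
  hits <- Reduce(`|`, lapply(df[cols], grepl, pattern = pattern))
  # `[[<-` uses the value of new_name; `df$new_name` would name the column "new_name".
  df[[new_name]] <- as.integer(hits)
  # Return the modified copy so the caller can reassign it.
  df
}

# Sample data from the question, rebuilt as a base data.frame (an assumption).
patient_db <- data.frame(
  patient = c("a", "b", "c"),
  diag_1  = c("8661", "2851", "8651"),
  diag_2  = c("8651", "8674", "2866"),
  diag_3  = c("2430", "3456", "9089"),
  stringsAsFactors = FALSE
)

# The result must be assigned back, otherwise the new column is lost.
patient_db <- add_diagnosis_flag("2851", "diag_1", "Anemia", patient_db)
print(patient_db)
```

In this sketch, passing several codes or several columns, for example cols = c("diag_1","diag_2","diag_3"), flags a row if any listed column starts with any listed code.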
Happy New Year
Albit