Hi I have this dataset here: diagnoses_2_or_more and diagnoses_3_or_more are categorical where 1 indicates yes and 0 indicates no.
id <- c(1,2,3,4,5,6,7)
grp <- c("1","1","1","2","2","3","3")
diagnosis_2_or_more <- c(1,1,0,1,0,1,0)
diagnosis_3_or_more <- c(1,0,1,1,1,0,1)
df <- data.frame(id,grp,diagnosis_2_or_more,diagnosis_3_or_more)
I want to calculate the percentage of people who have 2 or more diagnoses and who have 3 or more diagnoses for each group.
The desired dataset would look like this:
id <- c(1,2,3,4,5,6,7)
grp <- c("1","1","1","2","2","3","3")
diagnosis_2_or_more <- c(1,1,0,1,0,1,0)
diagnosis_3_or_more <- c(1,0,1,1,1,0,1)
perc_2_or_more <- c(0.67,0.67,0.67,0.5,0.5,0.5,0.5)
perc_3_or_more <- c(0.67,0.67,0.67,0.5,1,0.5,0.5)
df <- data.frame(id,grp,diagnosis_2_or_more,diagnosis_3_or_more,perc_2_or_more,perc_3_or_more)
For example for group 1, percentage of people who have 2 or more diagnoses would be calculated as 2/3 (2: number of people who have 2 or more diagnoses [coded as 1], total people in group 1: 3).
Is there a way to do this with group by and summarize or by any other way?
I would appreciate all the help there is! Thanks!!!