I ran this code to get the breakdown of ethnicities in my sample:
dataset %>%
group_by(ethnicity) %>%
summarise(percent = 100 * n()/nrow(dataset))
However, because subjects were able to select multiple ethnicity categories on their questionnaire, the results came out like this:
1 "[\"Aboriginal or Torres Strait Islander\",\"Caucasian\",\"Asian (inc. Indian subcontinent)\"]" 0.364
2 "[\"Aboriginal or Torres Strait Islander\",\"Caucasian\"]" 0.0910
3 "[\"Aboriginal or Torres Strait Islander\"]" 0.910
4 "[\"African\"]" 0.637
5 "[\"Asian (inc. Indian subcontinent)\"]" 0.0910
9 "[\"Caucasian\",\"Latino/Hispanic\"]" 0.182
10 "[\"Caucasian\",\"Middle Eastern\"]" 0.273
11 "[\"Caucasian\",\"Not listed\"]" 0.182
etc.
What would be the best/most efficient way to get a breakdown of the individual (non-combined) categories?
I basically just want a percentage breakdown of:
Caucasian -
African -
Latino/Hispanic -
Aboriginal or Torres Strait Islander -
Middle Eastern -
Etc.
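To make the goal concrete, here is a minimal sketch of the kind of result I am after. It assumes the ethnicity column holds JSON-array strings like the ones printed above, and that the tidyr and jsonlite packages are available. The four-row toy dataset and the `breakdown` name are made up for illustration and are not my real data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data mimicking the JSON-array strings in my ethnicity column
dataset <- tibble(
  id = 1:4,
  ethnicity = c(
    '["Caucasian","African"]',
    '["Caucasian"]',
    '["African"]',
    '["Caucasian","Middle Eastern"]'
  )
)

breakdown <- dataset %>%
  # Parse each JSON array into a character vector (one list element per subject)
  mutate(ethnicity = lapply(ethnicity, jsonlite::fromJSON)) %>%
  # Expand to one row per subject-category pair
  unnest(ethnicity) %>%
  # Count how many subjects selected each individual category
  count(ethnicity) %>%
  # Percent of all subjects (not of all selections) who chose the category
  mutate(percent = 100 * n / nrow(dataset))

print(breakdown)
```

Because a subject can select more than one category, the percentages computed this way can add up to more than 100.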