I'm trying to calculate the incidence (percentage) of a binary variable within each level of an income variable that has five brackets plus NA. I'm using:
afghan %>%
  group_by(income) %>%
  summarize(violent.exp.ISAF = n()) %>%
  mutate(Percentage = violent.exp.ISAF / sum(violent.exp.ISAF) * 100)
But this gives me each income bracket's share of the whole table, not the percentage of the binary variable within that specific income bracket:
#  income          violent.exp.taliban Percentage
#  <chr>                         <int>      <dbl>
#1 10,001-20,000                   616     22.4
#2 2,001-10,000                   1420     51.6
#3 20,001-30,000                    93      3.38
#4 less than 2,000                 457     16.6
#5 over 30,000                      14      0.508
#6 NA                              154      5.59
What I want is the percentage of the binary variable within each income bracket, i.e. the share of respondents in a bracket for whom the variable equals 1. Any advice?
A sample of the afghan dataset:
> dput(head(afghan))
structure(list(province = c("Logar", "Logar", "Logar", "Logar",
"Logar", "Logar"), district = c("Baraki Barak", "Baraki Barak",
"Baraki Barak", "Baraki Barak", "Baraki Barak", "Baraki Barak"
), village.id = c(80, 80, 80, 80, 80, 80), age = c(26, 49, 60,
34, 21, 18), educ.years = c(10, 3, 0, 14, 12, 10), employed = c(0,
1, 1, 1, 1, 1), income = c("2,001-10,000", "2,001-10,000", "2,001-10,000",
"2,001-10,000", "2,001-10,000", NA), violent.exp.ISAF = c(0,
0, 1, 0, 0, 0), violent.exp.taliban = c(0, 0, 0, 0, 0, 0), list.group = c("control",
"control", "control", "ISAF", "ISAF", "ISAF"), list.response = c(0,
1, 1, 3, 3, 2)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
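For reference, the within-bracket percentage described above could be sketched roughly as follows. This is only an illustration under some assumptions: that violent.exp.ISAF is coded 0/1, so its group mean is the proportion of 1s, and that rows with a missing value in that variable should be dropped from the percentage. The names afghan_sample, within_bracket, n and n.violent are made up for this example, and the data are just the six rows from the dput() output, reduced to the two columns used.

```r
library(dplyr)

# The six sample rows from the dput() output, keeping only the two columns used here
afghan_sample <- data.frame(
  income = c("2,001-10,000", "2,001-10,000", "2,001-10,000",
             "2,001-10,000", "2,001-10,000", NA),
  violent.exp.ISAF = c(0, 0, 1, 0, 0, 0)
)

within_bracket <- afghan_sample %>%
  group_by(income) %>%
  summarize(
    n = n(),                                                 # rows in the bracket
    n.violent = sum(violent.exp.ISAF, na.rm = TRUE),         # rows where the variable is 1
    Percentage = mean(violent.exp.ISAF, na.rm = TRUE) * 100  # share of 1s within the bracket
  )

print(within_bracket)
```

The difference from the attempt in the question is that n() only counts rows per bracket, whereas sum() and mean() of the 0/1 variable are evaluated inside each group. On these six rows the "2,001-10,000" bracket has one 1 out of five rows, so its Percentage is 20, and the NA bracket gets 0.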