I want to create a population pyramid with stacked bars that shows the distribution of males and females across age groups and, at the same time, how many nationals and how many foreigners are in each age group.
For example, the bar on the left shows males, with the values for nationals and foreigners stacked on top of each other.
My CSV has these columns: Age, Gender, Category, Population. Age holds the age buckets, Gender is either M or F, Category is N for national or F for foreigner, and Population is the number of people in that combination of Age, Gender and Category.
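To make the layout concrete, here is a minimal sketch of what such a file could look like when read into R. The rows and counts are made up for illustration, and `csv_text` and `sample_data` are hypothetical names; the real file has more age groups.

```r
# Hypothetical sample rows in the same semicolon-separated layout as the real file
csv_text <- "Age;Gender;Category;Population
0-19;M;N;900
0-19;M;F;200
0-19;F;N;850
0-19;F;F;180"

# read.csv() accepts inline text, so no file on disk is needed for this example
sample_data <- read.csv(text = csv_text, sep = ";")

# One row per Age/Gender/Category combination
str(sample_data)
```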
My code looks like this:
library(ggplot2)
# Read the semicolon-separated file
data <- read.csv("path/data.csv", sep = ";")
data$Population <- as.numeric(as.character(data$Population))
# Negate the male counts so that their bars extend to the left
data$Population <- ifelse(data$Gender == "M", -1 * data$Population, data$Population)
# Pyramid (column names are referenced directly inside aes(), matching the CSV header)
pyramid <- ggplot(data, aes(x = Age, y = Population, fill = Gender)) +
  geom_bar(data = subset(data, Gender == "F"), stat = "identity", col = "Black") +
  geom_bar(data = subset(data, Gender == "M"), stat = "identity", col = "Black") +
  scale_y_continuous(breaks = seq(-3000, 3000, 1000), labels = as.character(abs(seq(-3000, 3000, 1000)))) +
  coord_flip()
pyramid + scale_fill_manual(values = c("#BB381C", "#1C78BB")) + theme_linedraw()
This code produces this pyramid:
How can I adapt my code so that I can individually color male foreigners, male nationals, female foreigners and female nationals, and so that male and female foreigners are at the bottom of their respective bars?
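One direction that might work, shown here only as a sketch on made-up data: combine Gender and Category into a single factor and map `fill` to it, so that each of the four combinations gets its own color. The `toy` data frame, the `Group` column, its labels and the two lighter colors are all assumptions for illustration. The sketch also assumes that "bottom" means the segment nearest the zero line; by default `position_stack()` places the last factor level nearest the axis, and `position_stack(reverse = TRUE)` flips the order if it comes out the other way round.

```r
library(ggplot2)

# Hypothetical toy data with the same columns as the CSV (counts are made up)
toy <- expand.grid(Age = c("0-19", "20-39", "40-59"),
                   Gender = c("M", "F"),
                   Category = c("N", "F"),
                   stringsAsFactors = FALSE)
toy$Population <- c(900, 1200, 800, 850, 1100, 820,
                    200, 500, 150, 180, 450, 160)

# Mirror the male counts to the left, as in the code above
toy$Population <- ifelse(toy$Gender == "M", -toy$Population, toy$Population)

# One fill level per Gender/Category combination; within each gender the
# foreigner level comes last so that it is stacked nearest the zero line
toy$Group <- factor(paste(toy$Gender, toy$Category, sep = "."),
                    levels = c("M.N", "M.F", "F.N", "F.F"),
                    labels = c("Male nationals", "Male foreigners",
                               "Female nationals", "Female foreigners"))

pyramid <- ggplot(toy, aes(x = Age, y = Population, fill = Group)) +
  geom_col(colour = "black") +
  scale_y_continuous(breaks = seq(-3000, 3000, 1000), labels = abs) +
  scale_fill_manual(values = c("Male nationals"    = "#1C78BB",
                               "Male foreigners"   = "#8FC1E3",
                               "Female nationals"  = "#BB381C",
                               "Female foreigners" = "#E39A8A")) +
  coord_flip() +
  theme_linedraw()

pyramid
```

Because positive and negative values are stacked separately, a single `geom_col()` layer should be enough here and the two `subset()` layers would no longer be needed.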