I was given this data frame, which is also a frequency distribution, and was asked to plot a histogram of the age distribution of the whole population, adding the male and female profiles to the plot. What I need is a histogram like the one in "Two-variable frequency bar plot", with the male and female profiles overlapping, but with AgeClasses on the x axis. This is my code:

AgeClasses <- c('0-9','10-19','20-29','30-39','40-49', '50-59', '60-69','70-79','80-89', '90-99')
Frequencies <- c(1000,900,800,700,600,500,400,300,200,100)
SexRatioFM <- c(0.4,0.42,0.44,0.48,0.52,0.54,0.55,0.58,0.6,0.65)
df$Females <- c(SexRatioFM*Frequencies)
df$Males <- c(Frequencies-Females)

library(ggplot2)


ggplot(df) +
    geom_bar(mapping = aes(x = AgeClasses, y = Females), stat = "identity")

I would really appreciate your help in solving this task.

Ema Ilic

4 Answers

This type of plot is a stacked bar plot. To produce it most easily with ggplot2, you need to transform your data into long format, so that one column has all the counts for both male and female, and another column contains a factor variable with the labels "Male" and "Female". You can do this using tidyr::pivot_longer:

library(ggplot2)
library(tidyr)

pivot_longer(df, cols = c(Females, Males)) %>%
  ggplot() +
  geom_col(mapping = aes(x = AgeClasses, y = value, fill = name)) +
  labs(x = "Age", y = "Count", fill = "Gender")

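To see what "long format" means here, the following is a minimal base-R sketch that builds the reshaped table by hand. It assumes df is constructed as a data frame from the question's vectors (the question itself never creates df), and it borrows the column names name and value from the pivot_longer defaults. Unlike pivot_longer, this hand-built version keeps only the three columns the plot needs, and the row order differs, which does not matter for ggplot2.

```r
# Data from the question, assembled into a data frame (assumption:
# this is what the question's df was meant to be)
AgeClasses <- c('0-9','10-19','20-29','30-39','40-49',
                '50-59','60-69','70-79','80-89','90-99')
Frequencies <- c(1000,900,800,700,600,500,400,300,200,100)
SexRatioFM <- c(0.4,0.42,0.44,0.48,0.52,0.54,0.55,0.58,0.6,0.65)

df <- data.frame(AgeClasses, Frequencies, SexRatioFM)
df$Females <- df$SexRatioFM * df$Frequencies
df$Males <- df$Frequencies - df$Females

# Long format by hand: one row per (age class, sex) pair
long <- data.frame(
  AgeClasses = rep(df$AgeClasses, times = 2),
  name = rep(c("Females", "Males"), each = nrow(df)),
  value = c(df$Females, df$Males)
)

# 10 age classes x 2 sexes = 20 rows
print(head(long, 3))
print(nrow(long))
```

A table of this shape can be passed to ggplot() with fill = name, exactly as in the pivot_longer pipeline above.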

Allan Cameron

Try the following code:

AgeClasses <- c('0-9','10-19','20-29','30-39','40-49', '50-59', '60-69','70-79','80-89', '90-99')
Frequencies <- c(1000,900,800,700,600,500,400,300,200,100)
SexRatioFM <- c(0.4,0.42,0.44,0.48,0.52,0.54,0.55,0.58,0.6,0.65)
Females <- SexRatioFM*Frequencies
Males <- Frequencies-Females
df <- data.frame(AgeClasses=AgeClasses, Females=Females, Males=Males)
df <- reshape2::melt(df, id.vars = 'AgeClasses')
library(ggplot2)


ggplot(df) +
  geom_bar(mapping = aes(x = AgeClasses, y = value, fill=variable), stat = "identity")

Liman

Allan is right, but to reproduce the plot linked in the question, you need the bars overlaid rather than stacked. I did it like this:


library(ggplot2)
library(dplyr)
AgeClasses <- c('0-9','10-19','20-29','30-39','40-49', '50-59', '60-69','70-79','80-89', '90-99')
Frequencies <- c(1000,900,800,700,600,500,400,300,200,100)
SexRatioFM <- c(0.4,0.42,0.44,0.48,0.52,0.54,0.55,0.58,0.6,0.65)
df <- tibble(
  Females = c(SexRatioFM*Frequencies),
  Males = c(Frequencies-Females),
  AgeClasses = AgeClasses,
  Frequencies = Frequencies,
  SexRatioFM = SexRatioFM)

df %>% select(AgeClasses, Males, Females) %>%
  tidyr::pivot_longer(cols=c(Males, Females), names_to = "gender", values_to="val") %>%
  ggplot() +
  geom_bar(mapping = aes(x = AgeClasses, y=val, fill=gender, alpha=gender), stat="identity", position="identity") +
  scale_alpha_manual(values=c(.5, .4))

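The difference between the stacked and the overlaid version comes down to where each bar starts. As a rough numeric sketch (base R only, no plotting; the names stacked_top and overlaid_top are made up for illustration): with the default stacking, one segment sits on top of the other, so the top of each bar is the class total. With position="identity", both bars start at zero, so the highest visible edge is the larger of the two counts.

```r
# Data from the question
Frequencies <- c(1000,900,800,700,600,500,400,300,200,100)
SexRatioFM <- c(0.4,0.42,0.44,0.48,0.52,0.54,0.55,0.58,0.6,0.65)
Females <- SexRatioFM * Frequencies
Males <- Frequencies - Females

# Stacked: segments are piled up, so the bar top is the class total
stacked_top <- Females + Males

# Overlaid (position = "identity"): both bars start at zero,
# so the highest visible edge is the larger of the two counts
overlaid_top <- pmax(Females, Males)

print(data.frame(Females, Males, stacked_top, overlaid_top))
```

This is why the overlaid plot needs transparency (the alpha settings above): without it, the taller bar in each age class would hide the shorter one completely.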

DaveArmstrong

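You'll need to revamp how you create your sample data frame: in your code, df is never created as a data frame before you assign to df$Females, and Females exists only as a column, so Frequencies-Females cannot find it. Here's one way to do it: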

df <- data.frame(
  AgeClasses = c('0-9','10-19','20-29','30-39','40-49', '50-59', '60-69','70-79','80-89', '90-99'),
  Frequencies = c(1000,900,800,700,600,500,400,300,200,100),
  SexRatioFM = c(0.4,0.42,0.44,0.48,0.52,0.54,0.55,0.58,0.6,0.65))

df$Females = df$SexRatioFM*df$Frequencies
df$Males = df$Frequencies-df$Females 

library(ggplot2)

ggplot(df) +
  geom_bar(mapping = aes(x = AgeClasses, y = Females), fill="purple", stat = "identity", alpha=.8) +
  geom_bar(mapping = aes(x = AgeClasses, y = Males), fill="navy blue", stat = "identity", alpha=.4)

And you should get something like this:

Example output

MH765