I am trying to visualize demographic data in a histogram.

I would like to have a histogram that has:

- age groups (20-25, 25-30, ...)
- for each age group, two bars (female, male) with different colours (red, blue)

I tried many things, like creating separate dataframes for sex:

hist(dem.data.female$age,col="red"...)

hist(dem.data.male$age, col= "blue", add= T....)

I got a histogram, but the bars overlaid each other. I also tried installing the easy.ggplot2 package, but my R installation does not seem to have it.

Bilesh Ganguly
a.henrietty
    Please edit to fit these [standards](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). As a rule of thumb if you can't obtain a result when you copy and paste your code into a fresh R session, then your question needs to be edited because users cannot help unless your question is reproducible. In the worst case scenario, the question is often closed. This is also unclear: **I also tried installing the easy.ggplot2 package but my R programm seems not to have it.** – NelsonGon Jul 14 '19 at 12:05

1 Answer

Please provide a reproducible data set. Here is one with 52 "random" observations: an age, a sex indicator (coded 1 or 2), and an age-group factor grp that cuts age into two-year bins.

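  set.seed(123)                              # make the sample reproducible
  age <- sample(18:30, 52, replace = TRUE)   # 52 ages between 18 and 30
  sex <- sample(1:2, 52, replace = TRUE)     # sex coded as 1 or 2
  sex <- factor(sex)
  grp <- cut(age, breaks = seq(18, 30, 2), include.lowest = TRUE)  # two-year age groups
  df <- data.frame(age, sex, grp)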
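Since the question started from base R's hist(), here is a minimal base-R sketch of the same idea: count the observations per sex and age group with table(), then draw the counts side by side with barplot(beside = TRUE). The "female"/"male" labels for codes 1 and 2 are my assumption for illustration; the simulated data do not say which code is which.

```r
# Same simulated data as above
set.seed(123)
age <- sample(18:30, 52, replace = TRUE)
# Assumption: code 1 = female, code 2 = male (labels chosen for illustration only)
sex <- factor(sample(1:2, 52, replace = TRUE),
              levels = 1:2, labels = c("female", "male"))
grp <- cut(age, breaks = seq(18, 30, 2), include.lowest = TRUE)

# Rows are the sexes, columns are the age groups
counts <- table(sex, grp)

# beside = TRUE draws the rows next to each other instead of stacking them
barplot(counts, beside = TRUE, col = c("red", "blue"),
        legend.text = rownames(counts),
        xlab = "Age group", ylab = "Count")
```

This avoids the overlapping bars you get from two hist() calls with add = TRUE, because each sex gets its own bar within every age group.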

I'm not sure about easy.ggplot2, but either ggplot2 or lattice can draw grouped plots like the one you describe.

  library(lattice)
  library(ggplot2)

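For lattice, a common strategy is to aggregate the data first. After the aggregate() call, the age column of res holds the number of observations in each grp/sex combination: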

  res <- aggregate(age ~ grp + sex, df, length)
  barchart(age ~ grp, groups = sex, res)

[Figure: simple lattice solution]
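If you want the red/blue colours and a legend in the lattice version, something along these lines should work. It is a sketch, not the only way to do it: the "female"/"male" labels for codes 1 and 2 and the column name n are my own choices for illustration.

```r
library(lattice)

# Same simulated data as above
set.seed(123)
age <- sample(18:30, 52, replace = TRUE)
# Assumption: code 1 = female, code 2 = male (labels chosen for illustration only)
sex <- factor(sample(1:2, 52, replace = TRUE),
              levels = 1:2, labels = c("female", "male"))
grp <- cut(age, breaks = seq(18, 30, 2), include.lowest = TRUE)
df <- data.frame(age, sex, grp)

# Count observations per age group and sex; rename the count column for clarity
res <- aggregate(age ~ grp + sex, data = df, FUN = length)
names(res)[names(res) == "age"] <- "n"

p <- barchart(n ~ grp, groups = sex, data = res,
              origin = 0,  # start the bars at zero
              par.settings = list(superpose.polygon = list(col = c("red", "blue"))),
              auto.key = list(points = FALSE, rectangles = TRUE),
              xlab = "Age group", ylab = "Count")
print(p)
```

Setting the colours through par.settings rather than col keeps the legend drawn by auto.key in sync with the bars. Note that aggregate() with a formula only returns combinations that actually occur, so an empty age group for one sex would simply be missing from res.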

For ggplot2, aggregation is built into several of the "geoms". The first plot uses the continuous variable age on the x-axis:

  # the + must end the line, otherwise R treats the first line as a complete call
  ggplot(df, aes(x = age, fill = sex)) +
    geom_histogram(bins = 6, position = "dodge")

[Figure: ggplot by histogram]

  # the + must end the line, otherwise R treats the first line as a complete call
  ggplot(df, aes(x = grp, fill = sex)) +
    geom_bar(position = "dodge")

[Figure: ggplot by geom_bar]

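The second plot uses the discrete (factor) variable grp on the x-axis. Note the subtle difference between the two ggplot calls: geom_histogram() bins the continuous age values itself, whereas geom_bar() counts the rows in each existing level of grp.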
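To get the specific colours asked for in the question, scale_fill_manual() can map each level of sex to a colour. Here is a sketch based on the geom_bar version; as before, labelling codes 1 and 2 as "female" and "male" is an assumption on my part.

```r
library(ggplot2)

# Same simulated data as above
set.seed(123)
age <- sample(18:30, 52, replace = TRUE)
# Assumption: code 1 = female, code 2 = male (labels chosen for illustration only)
sex <- factor(sample(1:2, 52, replace = TRUE),
              levels = 1:2, labels = c("female", "male"))
grp <- cut(age, breaks = seq(18, 30, 2), include.lowest = TRUE)
df <- data.frame(age, sex, grp)

p <- ggplot(df, aes(x = grp, fill = sex)) +
  geom_bar(position = "dodge") +                                   # bars side by side
  scale_fill_manual(values = c(female = "red", male = "blue")) +   # one colour per level
  labs(x = "Age group", y = "Count", fill = "Sex")
print(p)
```

Naming the colour vector by factor level ties each colour to a level explicitly, so the mapping does not depend on the order of the levels.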

David O