I would like to create a histogram in ggplot2 in which each bin contains the same number of points, so that all bars have the same area. In base R this can be done as follows:

set.seed(123)
x <- rnorm(100)
hist(x, breaks = quantile(x, 0:10 / 10))

When I try the same thing in ggplot2, passing the quantile breaks to scale_x_continuous, I get the following:

library(ggplot2)
ggplot(data = data.frame(x), aes(x = x)) +
  geom_histogram(aes(y = after_stat(density)), bins = 10) +
  scale_x_continuous(breaks=quantile(x, 0:10 / 10))

Created on 2023-01-05 with reprex v2.0.2

Why does this produce a different output? Does anyone know how to create a histogram with equal-area bins in ggplot2, like the base R version above?

Quinten

1 Answer

If you want the same output as base R's hist, you can extract the breaks and densities from the object it returns and draw the bars yourself.

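library(ggplot2)

set.seed(123)
x <- rnorm(100)
# hist() returns the bin breaks and densities (and also draws the base plot)
hh <- hist(x, breaks = quantile(x, 0:10 / 10))

# One rectangle per bin: left/right edges from the breaks, height from the density
data.frame(
  left = head(hh$breaks, -1), right = tail(hh$breaks, -1),
  height = hh$density
) |>
  ggplot() +
  aes(xmin = left, xmax = right, ymin = 0, ymax = height) +
  geom_rect(fill = "lightgray", color = "black")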

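On why the original attempt looks different: scale_x_continuous(breaks = ...) only controls where the axis ticks are drawn, while bins = 10 still produces ten equal-width bins. A possible alternative that stays within ggplot2, offered here as a sketch rather than the only way to do it, is to pass the quantiles to the breaks argument of geom_histogram, which sets the bin edges themselves. The names brks, p, bars and areas are illustrative, and the check at the end assumes layer_data() exposes the computed density, xmin and xmax columns for the histogram layer.

```r
library(ggplot2)

set.seed(123)
x <- rnorm(100)

# Decile boundaries, so each bin should hold about 10% of the points
brks <- unname(quantile(x, 0:10 / 10))

p <- ggplot(data.frame(x), aes(x = x)) +
  # breaks here defines the bin edges (unlike breaks in scale_x_continuous)
  geom_histogram(aes(y = after_stat(density)), breaks = brks,
                 fill = "lightgray", colour = "black")
p

# Sanity check: bar area is density times bin width
bars <- layer_data(p)
areas <- bars$density * (bars$xmax - bars$xmin)
```

With density on the y axis, each bar's area is its share of the observations, so the areas sum to 1 and should come out roughly equal when the breaks are quantiles.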

MrFlick