Look at the data and graph in this example:
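# Simulate 10,000 rounded normal draws and clamp them to [50, 150]
df <- data.frame(x = round(rnorm(10000, mean = 100, sd = 15)))
df$x <- ifelse(df$x < 50, 50, df$x)
df$x <- ifelse(df$x > 150, 150, df$x)

library(ggplot2)

# Density histogram with a fitted normal curve overlaid.
# after_stat(density) replaces the deprecated ..density.. (ggplot2 >= 3.3.0).
ggplot(df) +
  aes(x = x) +
  geom_histogram(aes(y = after_stat(density)),
                 binwidth = 10,
                 fill = "#69b3a2",
                 color = "#e9ecef", alpha = 0.9) +
  stat_function(fun = dnorm, args = list(mean = mean(df$x),
                                         sd = sd(df$x)))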
The resulting graph is:
Note that the histogram extends beyond the bounds of the data. The data are explicitly clamped to the 50 to 150 range, but the histogram appears to represent data from 45 to 155. In other words, the bins seem to be centered on 50, 60, ..., 150 rather than starting and ending at the data limits. Also note that the normal curve stops at the correct limits.
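To make the 45 to 155 extent concrete, here is a minimal base-R sketch of the arithmetic that would produce it. It assumes that, when no bin position is given, the bin edges sit at half a bin width plus whole multiples of the bin width; that is my reading of what the plot shows, not ggplot2's actual code, and the variable names are only for illustration.

```r
# Assumption: edges sit at binwidth/2 + k * binwidth, so each bin is centred
# on a multiple of the bin width. This mimics the plot; it is not ggplot2 code.
binwidth   <- 10
data_range <- c(50, 150)   # limits enforced by the clamping above

offset <- binwidth / 2

# Nearest edge at or below the minimum, and at or above the maximum
lowest_edge  <- offset + floor((data_range[1] - offset) / binwidth) * binwidth
highest_edge <- offset + ceiling((data_range[2] - offset) / binwidth) * binwidth

edges <- seq(lowest_edge, highest_edge, by = binwidth)
print(range(edges))   # 45 155
```

Under that assumption the values clamped to 50 land in a bin spanning 45 to 55 and the values clamped to 150 land in one spanning 145 to 155, which matches the bars that appear outside the data limits.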
Is there a way to change the binning so that the bin edges line up with the data limits, i.e. so that the first bin starts at 50 and the last bin ends at 150?
comment: I have found workarounds such as "ggplot axis ticks fall at center of bin value rather than at the bin limits". If I understand correctly, the idea there is to shift the data by half the bin width, but in this case that would be wrong at the other end of the range. It would also ruin the normal curve.