This question concerns the correct interpretation of a ggplot. Specifically, I am attempting to create a three-group relative frequency histogram (where each percentage is relative to its own subgroup), and the groups are of unequal sizes. My excerpted data is this:

hist.df <- structure(list(Duration = c(25, 0, 181, 114, 99, 119, 30, 119, 
13, 60, 17, 189, 229, 182, 201, 175, 168, 45, 72, 176, 17, 11, 
23, 195, 174, 253, 8, 29, 11, 6, 178, 292, 36, 44, 90, 259, 81, 
57, 244, 117, 119, 29, 13, 15, 11, 25, 52, 136, 78, 218, 55, 
215, 6, 0, 97, 108, 39, 7, 107, 93, 201, 127, 71, 47, 149, 43, 
212, 0, 13, 55, 29, 128, 271, 186, 139, 179, 65, 286, 88, 181, 
34, 144, 158, 53, 115, 39, 98, 264, 83, 59, 57, 0, 3, 196, 24, 
41, 3, 8), status = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("cat", "dog", 
"rat"), class = "factor")), row.names = c(NA, -98L), groups = structure(list(
    status = structure(1:3, .Label = c("cat", "dog", "rat"), class = "factor"), 
    .rows = structure(list(1:28, 29:56, 57:98), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, -3L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

I am creating the histogram based on the discussion here with the following code:

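library(dplyr)    # provides %>% and group_by()
library(ggplot2)
# density * width is each bin's share of its own status group
ggplot(hist.df %>% 
         group_by(status),
       aes(x = Duration, y = after_stat(density * width), fill = status)) +
  geom_histogram(binwidth = 20)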

With the resulting output looking like this: [image: the resulting histogram of Duration, filled by status]

My understanding is that each bar's height is the product of the bin width and the density estimate for that bin, which yields the proportion relative to each 'status' group. If that's the case, shouldn't the bars for each group add up to 1? Judging by eye, they don't seem to. I'm concerned I'm missing something major. Thanks for any advice.

Flaunk

1 Answer

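This ended up being really straightforward. The bars are stacked, which is the default position for geom_histogram(). Although there is only a single y-axis, only the bottom bar in each bin starts at 0; every other bar starts at the top of the bar immediately below it. The proportion for a group is therefore the height of its bar segment, not the y-value at its top edge, and read that way each group does add up to 1.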
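For anyone who wants to confirm this numerically, here is a minimal sketch. It assumes a small made-up data frame (`toy` and its values are hypothetical, not the data from the question) and uses `layer_data()` to pull out the rectangles ggplot2 actually draws. The height of each segment is `ymax - ymin`, and those heights should sum to 1 within each group even though some segments start above 0.

```r
library(ggplot2)

# Hypothetical toy data (NOT the data from the question): two groups of unequal size
toy <- data.frame(
  Duration = c(5, 15, 25, 45, 10, 30, 50, 70, 90, 110),
  status   = factor(c(rep("cat", 4), rep("dog", 6)))
)

# Same mapping as in the question; geom_histogram() stacks bars by default
p <- ggplot(toy, aes(x = Duration, y = after_stat(density * width), fill = status)) +
  geom_histogram(binwidth = 20)

# layer_data() returns one row per drawn bar segment, including ymin and ymax
bins <- layer_data(p)

# A segment's height is its group's proportion in that bin
bins$height <- bins$ymax - bins$ymin

# Sum the heights within each group; each total should be 1 (up to floating point)
group_totals <- tapply(bins$height, bins$group, sum)
print(group_totals)

# Some segments start above 0 because they sit on top of another group's bar
print(any(bins$ymin > 0))

# Assumption: if every group should be read against 0, overlapping bars are one option
p_overlap <- ggplot(toy, aes(x = Duration, y = after_stat(density * width), fill = status)) +
  geom_histogram(binwidth = 20, position = "identity", alpha = 0.5)
```

Printing `p_overlap` draws semi-transparent bars that all start at 0; adding `facet_wrap(~ status)` to the original plot is another way to give each group its own baseline.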

Flaunk