
I have made this question a reproducible example. Here is a portion of my data frame:

df <- structure(list(`Room Out Date` = c("2018-07-08", "2018-07-08", 
                "2018-07-08", "2018-07-09", "2018-07-09", "2018-07-09", "2018-07-09", 
                "2018-07-09", "2018-07-09", "2018-07-09", "2018-07-09", "2018-07-09", 
                "2018-07-10", "2018-07-10", "2018-07-10"), 
                `Room Out Time` = c("20:11:00", 
                "20:43:00", "22:28:00", "18:00:00", "18:32:00", "18:40:00", "18:59:00", 
                "19:16:00", "19:22:00", "19:38:00", "19:48:00", "21:24:00", "18:12:00", 
                "18:38:00", "18:40:00")), row.names = c(NA, -15L), 
                class = c("tbl_df", "tbl", "data.frame"))

I would like to create a histogram with times on the x-axis ranging from 17:30 to 07:30 the next morning (with a binwidth of 30 minutes) and counts on the y-axis. I have tried converting the times with the chron package as well as with POSIXct, but ggplot doesn't seem to like either of those methods. Any help is much appreciated.

massisenergy
Andrea

1 Answer


Edited: Now collating all dates by half hour period

Try lubridate:

library(dplyr)      # mutate() and the %>% pipe
library(ggplot2)    # ggplot(), geom_histogram(), xlim()
library(lubridate)  # ymd(), as_datetime(), days()

df %>%
  mutate(fakedate = ymd("2000-01-01")) %>%  # pretend all happen on same day
  mutate(fakedate_time = as_datetime(paste(fakedate, `Room Out Time`))) %>% 
  # ifelse() drops the date-time class and returns seconds, so wrap it in as_datetime()
  mutate(fakedate_time = as_datetime(ifelse(fakedate_time > as_datetime("2000-01-01 12:00:00"),
                                            fakedate_time,
                                            fakedate_time + days(1)))) %>%  # move times before noon to the next day
  ggplot(aes(fakedate_time)) +
  geom_histogram(binwidth = 1800) +  # bins of 1800 seconds = 30 minutes
  xlim(as_datetime("2000-01-01 17:00:00"), as_datetime("2000-01-02 07:30:00"))
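
To sanity-check the bin counts without plotting, the same wrap-around idea can be sketched in base R. This is only an illustration, not part of the original answer: the helper name `minutes_since_start` is hypothetical, the 17:30 window start comes from the question, and the two post-midnight times in the sample vector are made up to show the wrap.

```r
# Hypothetical helper: minutes elapsed since the window start (17:30 by default),
# wrapping past midnight so that 07:00 sorts after 23:30.
minutes_since_start <- function(times, start = "17:30:00") {
  to_minutes <- function(x) {
    parts <- strsplit(x, ":", fixed = TRUE)
    vapply(parts, function(p) {
      as.numeric(p[1]) * 60 + as.numeric(p[2]) + as.numeric(p[3]) / 60
    }, numeric(1))
  }
  (to_minutes(times) - to_minutes(start)) %% (24 * 60)
}

# First three values are from the question's data; the last two are invented
# post-midnight times used only to demonstrate the wrap-around.
times <- c("20:11:00", "22:28:00", "18:00:00", "00:15:00", "07:00:00")
elapsed <- minutes_since_start(times)

# 30-minute bins covering 17:30 to 07:30 (14 hours = 28 bins)
bins <- cut(elapsed, breaks = seq(0, 14 * 60, by = 30), right = FALSE)
table(bins)
```

Working in "minutes since 17:30" avoids the fake date entirely, at the cost of having to relabel the axis yourself if you plot the result.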


Andy Baxter
  • Hi, thank you for your reply. My large dataframe contains days for up to a year of data. I am trying to plot over the time window irrelevant of the day. How would I do this using your method? – Andrea Jul 11 '19 at 19:38
  • Ah I understand! I've edited code above to try another solution - basically gives every time record a 'fake date' to pretend they all happen on the same day, then adds 1 day to the records taking place after midnight, to give a continuous x axis. Does that produce something more helpful? – Andy Baxter Jul 11 '19 at 20:13
  • (I've also edited the time slots a little to show how post-midnight times work, try it on your data and see if it sticks at any further errors) – Andy Baxter Jul 11 '19 at 20:14
  • Hi Andrea, any luck? – Andy Baxter Jul 12 '19 at 10:21
  • Hi Andrew - Thanks so much for your help! That's perfect and exactly what I needed! – Andrea Jul 12 '19 at 13:55