2

The goal I'm trying to achieve is to count how many events occur for the given day of week and hour

sample data:

Date            Time
2007-07-20   11:00:00
2007-01-05   06:15:00
2007-12-11   23:55:00
2007-12-11   23:55:00

to

Output:

 Day    Hour      Count
Friday  11:00 AM     1
Friday  6:00 AM      1 
Friday  11:00 PM     2

Any help is greatly appreciated!

Thank you!

thelatemail
  • 91,185
  • 12
  • 128
  • 188
Jamal Jenkins
  • 23
  • 1
  • 3

4 Answers4

1

I like to use a mixture of the chron and lubridate packages to work with dates, times and datetimes.

Here is some sample data:

library(dplyr)
library(chron)
library(lubridate)

df_foo = data_frame(
  date = seq.Date(from = as.Date("2016-01-01"), to = as.Date("2016-10-01"), by = "day"),
  times = chron::times(runif(n = 275, min = 0, max = 1))
)

which looks like this:

> df_foo
# A tibble: 275 x 2
         date       times
       <date> <S3: times>
1  2016-01-01    10:26:24
2  2016-01-02    21:47:00
3  2016-01-03    15:22:06
4  2016-01-04    19:47:55
5  2016-01-05    08:51:37
6  2016-01-06    14:27:47
7  2016-01-07    17:55:59
8  2016-01-08    07:45:36
9  2016-01-09    16:52:56
10 2016-01-10    07:11:58
# ... with 265 more rows

Then, you can group them by the day of the week and the hour of day:

df_foo %>% 
  group_by(
    `Day of Week` = lubridate::wday(date),
    `Hour of Day` = chron::hours(times)
  ) %>% 
  tally()

which results in:

> df_foo %>% 
+   group_by(
+     `Day of Week` = lubridate::wday(date),
+     `Hour of Day` = chron::hours(times)
+   ) %>% 
+   tally()
Source: local data frame [137 x 3]
Groups: Day of Week [?]

   Day of Week Hour of Day     n
         <dbl>       <dbl> <int>
1            1           0     4
2            1           1     2
3            1           4     3
4            1           5     5
5            1           6     1
6            1           7     3
7            1           8     2
8            1          10     2
9            1          11     3
10           1          14     1
# ... with 127 more rows
tchakravarty
  • 10,736
  • 12
  • 72
  • 116
1

Base R copes with this just fine too:

aggregate(
  count ~ wkday + hour,
  data=transform(dat,
                 wkday=format(as.Date(Date), "%A"),
                 hour=format(as.POSIXct(Time,format="%H:%M:%S"), "%I %p"),
                 count=1),
  FUN=sum
)
#    wkday  hour count
#1  Friday 06 AM     1
#2  Friday 11 AM     1
#3 Tuesday 11 PM     2

Where dat was:

dat <- structure(list(Date = c("2007-07-20", "2007-01-05", "2007-12-11", 
"2007-12-11"), Time = c("11:00:00", "06:15:00", "23:55:00", "23:55:00"
)), .Names = c("Date", "Time"), row.names = c(NA, -4L), class = "data.frame")
thelatemail
  • 91,185
  • 12
  • 128
  • 188
1

You can use the tidyverse and function count for this I think. Here is an explanation: http://dplyr.tidyverse.org/reference/tally.html

library(tidyverse)

dat %>% group_by(Date, Time) %>% count()

# A tibble: 3 x 3
# Groups:   Date, Time [3]
        Date     Time     n
       <chr>    <chr> <int>
1 2007-01-05 06:15:00     1
2 2007-07-20 11:00:00     1
3 2007-12-11 23:55:00     2

where:

dat <- structure(list(Date = c("2007-07-20", "2007-01-05", "2007-12-
11", 
"2007-12-11"), Time = c("11:00:00", "06:15:00", "23:55:00", "23:55:00"
)), .Names = c("Date", "Time"), row.names = c(NA, -4L), class = 
"data.frame")
Lars
  • 41
  • 3
0

This will also work. where df is the original data frame to start with:

df$Date <- as.Date(df$Date)
library(lubridate)
aggregate(Count~Day+Hour, data=data.frame(Day=wday(df$Date, label = TRUE, abbr = FALSE),
                 Hour=format(strptime(df$Time, format='%H:%M:%S'), '%I:00 %p'),Count=1), 
                 FUN='length')  

   Day    Hour     Count
1  Friday 06:00 AM     1
2  Friday 11:00 AM     1
3 Tuesday 11:00 PM     2
Sandipan Dey
  • 21,482
  • 2
  • 51
  • 63