0

I have 2 data frames in R:

  1. tvNationalSale: each row is one TV ad placement
  2. workingNational: each row is total web sessions by minute

I want to add a calculated column to tvNationalSale that contains the sum of sessions in the 5 minutes prior to an ad showing. I am using the dplyr package for basic formatting.

> glimpse(tvNationalSale)
Observations: 1443
Variables:
$ Sort.Date        (fctr) 5/8/2015, 5/8/2015, 5/8/2015, 5/8/2015, 5/8/2015, 5/8/2015, 5/8/2015, 5/8...
$ Before.Time      (time) 2015-08-05 06:03:00, 2015-08-05 21:12:00, 2015-08-05 08:49:00, 2015-08-05...
$ Ad.Time          (time) 2015-08-05 06:08:00, 2015-08-05 21:17:00, 2015-08-05 08:54:00, 2015-08-05...
$ After.Time       (time) 2015-08-05 06:13:00, 2015-08-05 21:22:00, 2015-08-05 08:59:00, 2015-08-05...
$ Market.Long.Desc (fctr) National, National, National, National, National, National, National, Nat...
$ Campaign.Name    (fctr) europe-sale, europe-sale, europe-sale, europe-sale, europe-sale, europe-s...

> glimpse(workingNational)
Observations: 44616
Variables:
$ date     (date) 2015-05-01, 2015-05-01, 2015-05-01, 2015-05-01, 2015-05-01, 2015-05-01, 2015-05-0...
$ hour     (fctr) 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
$ minute   (fctr) 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,...
$ sessions (dbl) 161, 71, 65, 58, 63, 58, 56, 41, 56, 45, 58, 57, 37, 48, 37, 41, 43, 44, 36, 38, 4...
$ time     (chr) "01:01:00", "01:02:00", "01:03:00", "01:04:00", "01:05:00", "01:06:00", "01:07:00"...
$ datetime (time) 2015-05-01 01:01:00, 2015-05-01 01:02:00, 2015-05-01 01:03:00, 2015-05-01 01:04:0...

This example shows how to calculate period metrics within one data frame but I can't figure out how to calculate a similar metric from a separate data frame.

I tried this code which I think doesn't work because I am trying to reference a separate data frame in the mutate() command.

tvNationalSale <- tvNationalSale %>%
mutate(Before.Sessions=sum(filter(workingNational, datetime>=tvNationalSale$Before.Time & datetime<=tvNationalSale$Ad.Time)$sessions))

Any ideas on how to append calculated metrics from another data frame?

Community
  • 1
  • 1

1 Answers1

0

Assuming your workingNational data doesn't have gaps or other irregularities, you could look up the location of each ad time in workingNational and then just take the five entries leading up to that time:

indices <- match(tvNationalSale$Ad.Time, workingNational$datetime)
tvNationalSale$fiveMinutesBefore <- rowSums(sapply(1:5, function(x) workingNational$sessions[indices-x]))
head(tvNationalSale)
#               Ad.Time fiveMinutesBefore
# 1 2015-01-03 04:02:00              3126
# 2 2015-01-05 02:57:00              2221
# 3 2015-01-04 14:53:00              4269
# 4 2015-01-07 01:17:00              1916
# 5 2015-01-06 15:37:00              2484
# 6 2015-01-03 14:23:00              3092

Data:

set.seed(144)
workingNational=data.frame(datetime=seq(from=ISOdate(2015, 1, 1), to=ISOdate(2015, 1, 8), by="min"))
workingNational$sessions <- sample(1:1000, nrow(workingNational), replace=TRUE)
tvNationalSale=data.frame(Ad.Time=sample(workingNational$datetime, 100))
josliber
  • 43,891
  • 12
  • 98
  • 133