My goal is to write a loop that creates a separate data.frame for each level of a factor called hour in a multivariate time series. The data is daily and has values for demand and ad spend at each hour of the day. Each data.frame has one date column, one demand column, and 8 adspend columns representing the adspend for the current hour and the 7 previous hours.

For example, two loop cycles for i=9 and i=3 would produce:

For the 9am data.frame, the columns will be Date, Demand9AM, AdSpend9AM...AdSpend2AM
For the 3am data.frame, the columns will be Date, Demand3AM, AdSpend3AM...AdSpend8PM (yesterday)

The trick is that earlier hours will have to pull some adspend from the previous day's hours.

A couple of solid coders on this site suggested I read about the "zoo" package. I did! So I have been able to take this problem to a solid place. Below is code for pseudo data that outputs a sequence of data.frames similar to what I need. Because I am a novice, I am not sure this is the most efficient way to create this loop. So my questions are:

  1. Is there a simpler way to create this loop?

  2. Is there a way to assign names to the variables within the loop?

  3. Is it possible to create the dataframes in a vectorized way?

The first question is by far the most important. Thank you.

set.seed(1)
library(forecast)
library(lubridate)
library(zoo)
library(reshape)

set.seed(31)
foo <- function(myHour, myDate){
   rlnorm(1, meanlog=0,sdlog=1)*(myHour) + (150*myDate) 
}
Hour <- 1:24
Day <-1:90
dates <-seq(as.Date("2012-01-01"), as.Date("2012-3-30"), by = "day")
myData <- expand.grid( Day, Hour)
names(myData) <- c("Date","Hour")

myData$Adspend <- apply(myData, 1, function(x) foo(x[2], x[1]))
myData$Date <-dates

## one noise draw per row (rnorm(1) would add the same constant to every row)
myData$Demand <- rnorm(nrow(myData), mean = 0, sd = 1) + .75*myData$Adspend
## ok, done with the fake data generation. 

myData


## reshape ad spend to wide format: one row per date, one column per hour
ADDate<-myData[,-4]
DemDate<-myData[,-3]
HourAD<-melt(ADDate, id.vars=c("Date","Hour"), measure.vars=c("Adspend"))
HourAD<-cast(HourAD,...~Hour)
ADHR<-zoo(HourAD,HourAD$Date)
## same reshape for demand
HourDemand<-melt(DemDate, id.vars=c("Date","Hour"), measure.vars=c("Demand"))
HourDemand<-cast(HourDemand,...~Hour)
DEMHR<-zoo(HourDemand,HourDemand$Date)

## build one merged zoo object per hour
DATASET <- vector("list", length(Hour))
for (i in seq_along(Hour)) {
  if (i == 1) {
    ## first hour: current demand and ad spend plus yesterday's late hours
    DATASET[[i]] <- merge(DEMHR[,1], ADHR[,1], lag(ADHR[,18:24], -1))
  } else {
    ## later hours: current demand and ad spend plus columns reused from the previous hour
    DATASET[[i]] <- merge(DEMHR[,i], ADHR[,i], DATASET[[i-1]][,2:7])
  }
}
Eric Blake
  • Hi Eric. Just a friendly suggestion here: based on the questions you have been posting (and I guess removing?) over the past few days, it seems there might be an overall task which can lend itself to more effective ways of reaching the same end goal. Please feel free to point out what that goal is, either here or in the `r` chat room – Ricardo Saporta Oct 04 '13 at 18:07
  • Okay, thanks Ricardo. I need to read the site instructions carefully so that my intentions are well understood and I am able to give proper credit for supportive help. I guess the overall task is to run separate regression models of current and past hourly ad spend on demand (quotes) for each hour of the day. So there will be 24 separate regression models. I hope that helps. Thanks – Eric Blake Oct 04 '13 at 19:09
  • Jack, your work inspired what I was able to send in this latest post, so thank you. I am just not very clear in articulating what I need. The code I sent actually generates it, it is just very clunky and ugly. – Eric Blake Oct 04 '13 at 19:13
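
The layout described in the question (each hour's demand next to the ad spend for that hour and the seven preceding hours, wrapping into the previous day) might also be sketched without a loop over zoo objects. The following is a minimal base R sketch under stated assumptions, not the asker's method: the toy data, the helper `lag_by`, and the column names `AdSpendLag0`...`AdSpendLag7` are all hypothetical and chosen only for illustration. The idea is to treat the data as one continuous hourly series sorted by date and hour, lag it 0 to 7 steps, and then split by hour.

```r
# Toy hourly data: 5 days x 24 hours (hypothetical values, not the question's data)
set.seed(31)
n_days <- 5
dates  <- seq(as.Date("2012-01-01"), by = "day", length.out = n_days)
long <- data.frame(
  Date = rep(dates, each = 24),      # rows are ordered by Date, then Hour
  Hour = rep(1:24, times = n_days)
)
long$Adspend <- rlnorm(nrow(long))
long$Demand  <- 0.75 * long$Adspend + rnorm(nrow(long))

# Hypothetical helper: shift x back by k positions, padding the start with NA
lag_by <- function(x, k) c(rep(NA, k), head(x, length(x) - k))

# One column per lag (0 = current hour, 7 = seven hours earlier).
# Because the series is continuous across days, early hours pick up
# values from the previous day automatically.
n_lags <- 7
lagged <- sapply(0:n_lags, function(k) lag_by(long$Adspend, k))
colnames(lagged) <- paste0("AdSpendLag", 0:n_lags)

# Combine, then split into a named list with one data.frame per hour
wide    <- cbind(long[, c("Date", "Hour", "Demand")], lagged)
by_hour <- split(wide, wide$Hour)

head(by_hour[["3"]])   # the data.frame for hour 3
```

Here `by_hour[["3"]]` holds the hour-3 rows, with `AdSpendLag7` taken from hour 20 of the previous day and `NA` on the first date, where no earlier day exists. Naming the columns via `paste0()` is one way to address the variable-naming question, and each list element could then be passed to a per-hour regression if that is the end goal.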

0 Answers