My goal is to write a loop that creates a unique data.frame for each hour of a multivariate time series that is split by a factor called hour. The data is daily and has values for demand and ad spend at each hour of the day. Each data.frame has one date column, one demand column and 8 adspend columns representing the adspend for the current hour and the 7 previous hours.

For example, two loop cycles for i=3 and i=9 would produce:

- For the 9am data.frame, the columns will be Date, Demand9AM, AdSpend9AM...AdSpend2AM
- For the 3am data.frame, the columns will be Date, Demand3AM, AdSpend3AM...AdSpend8PM (yesterday)

The trick is that earlier hours will have to pull some adspend from the previous day's hours.

A couple of solid coders on this site suggested I read about the "zoo" package. I did! So I have been able to take this problem to a solid place. Here is code for pseudo data that outputs a sequence of data.frames similar to what I need. Because I am a novice, I am not sure this is the most efficient way to create this loop. So my questions are:
- Is there a simpler way to create this loop?
- Is there a way to assign names to the variables within the loop?
- Is it possible to create the data.frames in a vectorized way?
The first question is far more important. Thank you.
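To pin down the target layout, here is a minimal base R sketch of what a single hour's data.frame should look like. It is only an illustration under assumptions: `hour_frame` is a hypothetical helper (not part of zoo or reshape), the toy values are made up so that lags are easy to check by eye, hours are numbered 1 to 24 as in the pseudo data, and the column names use that numbering rather than AM/PM labels.

```r
# Toy data: 5 days x 24 hours. Adspend encodes day and hour
# (value = day * 100 + hour) so each lagged value is easy to verify.
n_days <- 5
dates  <- seq(as.Date("2012-01-01"), by = "day", length.out = n_days)
toy <- expand.grid(Hour = 1:24, Day = seq_len(n_days))  # Hour varies fastest
toy$Date    <- dates[toy$Day]
toy$Adspend <- toy$Day * 100 + toy$Hour
toy$Demand  <- 0.75 * toy$Adspend

# hour_frame() is a hypothetical helper written for this illustration.
# It treats the data as one continuous hourly series, so looking back
# k hours crosses midnight into the previous day automatically.
hour_frame <- function(d, h, n_lags = 7) {
  d   <- d[order(d$Date, d$Hour), ]       # sort into a single hourly series
  idx <- which(d$Hour == h)               # one row per day for hour h
  out <- data.frame(Date = d$Date[idx], Demand = d$Demand[idx])
  names(out)[2] <- paste0("Demand_h", h)
  for (k in 0:n_lags) {
    src <- idx - k                        # position k hours earlier
    src[src < 1] <- NA                    # no history before the first day
    lag_hour <- (h - k - 1) %% 24 + 1     # hour-of-day label in 1..24
    out[[paste0("AdSpend_h", lag_hour)]] <- d$Adspend[src]
  }
  out
}

# One data.frame per hour, built without an explicit for loop.
DATASET <- lapply(1:24, function(h) hour_frame(toy, h))
names(DATASET) <- paste0("hour", 1:24)
```

`DATASET[["hour9"]]` then holds the date, the demand for hour 9, and the adspend for hours 9 down to 2. For early hours, the columns that reach into the previous day are NA on the first date because there is no earlier day to pull from.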
library(forecast)
library(lubridate)
library(zoo)
library(reshape)
## seed once, right before the random draws
set.seed(31)
foo <- function(myHour, myDate) {
  rlnorm(1, meanlog = 0, sdlog = 1) * myHour + (150 * myDate)
}
Hour <- 1:24
Day <- 1:90
dates <- seq(as.Date("2012-01-01"), as.Date("2012-03-30"), by = "day")
myData <- expand.grid(Day, Hour)
names(myData) <- c("Date", "Hour")
myData$Adspend <- apply(myData, 1, function(x) foo(x[2], x[1]))
myData$Date <- dates
myData$Demand <- (rnorm(1, mean = 0, sd = 1) + .75 * myData$Adspend)
## ok, done with the fake data generation.
myData
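## split into adspend-only and demand-only tables
ADDate <- myData[, -4]
DemDate <- myData[, -3]
## reshape each to wide form (one column per hour), then index by date
HourAD <- melt(ADDate, id.vars = c("Date", "Hour"), measure.vars = "Adspend")
HourAD <- cast(HourAD, ... ~ Hour)
ADHR <- zoo(HourAD, HourAD$Date)
HourDemand <- melt(DemDate, id.vars = c("Date", "Hour"), measure.vars = "Demand")
HourDemand <- cast(HourDemand, ... ~ Hour)
DEMHR <- zoo(HourDemand, HourDemand$Date)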
## build one merged zoo object per hour
DATASET <- vector("list", length(Hour))
for (i in seq_along(Hour)) {
  if (i == 1) {
    ## first hour: columns 18:24 are shifted back one day
    DATASET[[i]] <- merge(DEMHR[, 1], ADHR[, 1], lag(ADHR[, 18:24], -1))
  } else {
    ## later hours: reuse columns 2:7 of the object built for hour i - 1
    DATASET[[i]] <- merge(DEMHR[, i], ADHR[, i], DATASET[[i - 1]][, 2:7])
  }
}