
I have this data set https://gist.github.com/natemiller/42eaf45747f31a6ccf9a

I'm trying to fit a rolling regression with rollapply from the zoo package, following the examples in the rollapply help. I keep getting what I imagine is a simple error, but one I haven't been able to work around.

If I load the above data as "dat" and then run this:

    dat$Date<-as.POSIXct(dat$Date, format="%m/%d/%y %H:%M")

    library(zoo)

    roll<-rollapply(dat, width = 6, FUN = function(d) coef(lm(Temp~Date, data=d)),  align="right")

and I get the error

    Error in eval(predvars, data, env) : invalid 'envir' argument

dat should be an appropriate input to lm, and the same lm call works outside of rollapply, so the error must arise in rollapply itself. I assume it's something simple, but I'd appreciate help. Thanks.

GSee
Nate Miller

2 Answers


First of all, I don't think what you are doing makes sense: you are trying to fit a regression to only 6 values.

The error occurs because you don't give lm a suitable data argument. Inside rollapply, d is an atomic vector of length 6, whereas lm needs a data.frame with two columns, Temp and Date. For example, the first d is:

    d
    9.5 9.5 9.5 9.5 9.5 9.5 

Calling lm with this d reproduces the error:

    lm(Temp~Date, data=d)
    Error in eval(predvars, data, env) : 
      numeric 'envir' arg not of length one

You don't have the Date values of the current rolling window, only the Temp values.
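As a rough sketch of the point above (the data here are made up, not the asker's file, and the names `z`, `hrs`, `len_by_col` and `roll_slope` are only for illustration): by default rollapply hands FUN one column of the window at a time as a plain vector, while `by.column = FALSE` hands it all columns of the window together, which can then be converted to a data.frame for lm. The time is stored as elapsed hours in its own column, so the slope comes out per hour.

```r
library(zoo)

# Hypothetical stand-in for the asker's data: 12 readings, 10 minutes apart,
# with temperature rising by exactly 0.5 per hour.
hrs <- (0:11) / 6
z <- zoo(cbind(hrs = hrs, Temp = 9.5 + 0.5 * hrs),
         as.POSIXct("2013-03-22 00:00", tz = "UTC") + 3600 * hrs)

# Default (by.column = TRUE): FUN sees one column at a time, as a bare vector
# of 6 values, so there is nothing for a formula like Temp ~ Date to look up.
len_by_col <- rollapply(z, 6, function(d) length(d))

# by.column = FALSE: FUN sees every column of the 6-row window together.
roll_slope <- rollapply(z, 6, by.column = FALSE, align = "right",
  FUN = function(w) coef(lm(Temp ~ hrs, data = as.data.frame(w)))[["hrs"]])
```

With this synthetic series every window should give a slope of about 0.5 per hour. Whether a 6-point regression is meaningful for real data is a separate question.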

agstudy
  • Sorry, my understanding of rollapply was that the function specified (function(d)) was applied to the data frame (in this case, dat) that serves as the first input to rollapply, so d would be a data frame. This is how the functions within plyr (such as ddply) seem to work; I guess I was mistaken about how the example code in the rollapply help was working. As for the regression on 6 points, I have data measured every 10 minutes and want to calculate 60 minute averages throughout the day, so I'm stuck with a regression with 6 points. – Nate Miller Mar 22 '13 at 20:58
  • `rollapply` is not like `plyr`; it is more like the `filter` function. It moves a window along your time series, and the width argument is the width of that window. I don't understand the second part of your comment. You have 6*24 values per day, and you would like to transform them into what? – agstudy Mar 22 '13 at 21:06
  • Thanks. What I would like to do is roll through the time series with a window width of 60 min (representing 6 values, as they are measured every 10 min) and calculate the change in temperature (the slope) for each window. My goal then is to determine the maximum slope for each day and when in the day it occurred. – Nate Miller Mar 22 '13 at 21:22
  • Try something like this: `rollmax(diff(dat),k=6)` – agstudy Mar 22 '13 at 22:02

Try this:

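    library(zoo)
    # Read the CSV into a zoo series indexed by column 2; minutes are %M, not %S.
    dat <- read.zoo("sampleTempData.csv", header = TRUE, sep = ",", 
        index = 2, tz = "", format = "%m/%d/%y %H:%M")

    # Roll over row positions so each window can subset dat and time(dat) in lm().
    Seq <- zoo(seq_along(dat), time(dat))
    coefs <- rollapply(Seq, 6, function(ix) coef(lm(dat ~ time(dat), subset = ix)))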

ADDED: The poster added to the question, so here is some additional code. Note that we are using POSIXct for the date/times, so the time units associated with the coefs zoo object are seconds regardless of the input format. At the end we convert the slope from per second to per day. See ?aggregate.zoo

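    colnames(coefs) <- c("Intercept", "slope")
    Seq.coefs <- zoo(1:nrow(coefs), time(coefs))
    # ix holds the row positions for one day; map which.max() back through ix.
    max.coefs <- function(ix) coefs[ix[which.max(coefs[ix, 2])], ]
    ag <- aggregate(Seq.coefs, as.Date, max.coefs)
    # Slopes are per second; multiply by 24 * 3600 to express them per day.
    transform(ag, slope = slope * 24 * 3600)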
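To make the unit point concrete, here is a small sketch with made-up readings (the names `tt`, `secs`, `temp` and the `slope_per_*` variables are illustrative only). Because POSIXct stores times as seconds, a slope fitted against it is a change per second, so multiplying by 60 treats it as if it were per minute and understates an hourly rate by a factor of 60. The sketch regresses on elapsed seconds, which gives the same slope as regressing on the raw POSIXct values.

```r
# Hypothetical readings: every 10 minutes, warming by exactly 0.5 per hour.
tt   <- as.POSIXct("2013-03-22 00:00", tz = "UTC") + 600 * (0:5)
secs <- as.numeric(tt) - as.numeric(tt[1])   # POSIXct counts seconds
temp <- 9.5 + 0.5 * secs / 3600

# The fitted slope is in temperature units per second.
slope_per_sec <- coef(lm(temp ~ secs))[["secs"]]

# Rescale to the unit you actually want to report.
slope_per_min  <- slope_per_sec * 60
slope_per_hour <- slope_per_sec * 3600
slope_per_day  <- slope_per_sec * 24 * 3600
```

Here `slope_per_hour` should come out at about 0.5, matching how the data were constructed, while `slope_per_min` is 60 times smaller.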
G. Grothendieck
  • Thanks! I've changed the time format to %H:%M not %S. I'm actually looking for the maximum slope per day (which I didn't say in my post) so I extended your code: `date<-index(coefs); data<-data.frame(date, coefs[,1],coefs[,2]); names(data)<-c("date","temp","slope"); library(lubridate); data$day<-floor_date(data$date,"day"); library(plyr); ddply(data, .(day),function(d) d[which.max(d$slope),])`. But the slopes don't look large enough, even if they are per minute. Temperature changes are at least 0.5 per hour, but the slopes (multiplied by 60) are only ~0.09-0.15. – Nate Miller Mar 25 '13 at 15:56