I'm trying to calculate 8-hour rolling means by combining plyr's ddply with openair's rollingMean on a pollutant data frame that looks something like this:
df1
date co code
2000-01-17 01:00:00 0.97000 42
2000-01-17 02:00:00 0.97000 42
2000-01-17 03:00:00 0.98000 42
2000-01-17 04:00:00 0.98000 42
2000-02-04 08:00:00 0.70000 42
2000-02-04 09:00:00 1.40000 42
2000-02-04 10:00:00 1.51000 42
2000-02-04 11:00:00 1.49000 43
2000-02-04 12:00:00 1.98000 43
2000-02-04 15:00:00 1.61000 43
2000-02-04 16:00:00 1.88000 43
2000-02-04 17:00:00 1.64000 43
2000-02-04 18:00:00 1.62000 43
2000-02-04 19:00:00 2.05000 43
As you can see, the time series isn't complete, which is why I'm using openair's rollingMean (it works from the "date" column). There are also several station codes, which I split with ddply because rollingMean doesn't work with more than one station at a time.
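To make the sample above reproducible, here is a minimal base-R sketch that rebuilds the rows shown and lists the hourly gaps per station. The `tz = "UTC"` setting and the `gaps` name are assumptions for illustration; the real data may use a different time zone.

```r
# Rebuild the sample rows shown above (tz = "UTC" is an assumption)
df1 <- data.frame(
  date = as.POSIXct(c(
    "2000-01-17 01:00:00", "2000-01-17 02:00:00", "2000-01-17 03:00:00",
    "2000-01-17 04:00:00", "2000-02-04 08:00:00", "2000-02-04 09:00:00",
    "2000-02-04 10:00:00", "2000-02-04 11:00:00", "2000-02-04 12:00:00",
    "2000-02-04 15:00:00", "2000-02-04 16:00:00", "2000-02-04 17:00:00",
    "2000-02-04 18:00:00", "2000-02-04 19:00:00"
  ), tz = "UTC"),
  co = c(0.97, 0.97, 0.98, 0.98, 0.70, 1.40, 1.51,
         1.49, 1.98, 1.61, 1.88, 1.64, 1.62, 2.05),
  code = c(rep(42, 7), rep(43, 7))
)

# Time step between consecutive records, in hours, for each station.
# Any value larger than 1 marks a gap in the hourly series.
gaps <- lapply(split(df1$date, df1$code),
               function(d) as.numeric(diff(d), units = "hours"))
print(gaps)
```

Both stations have gaps in this sample, so any rolling mean has to work from the timestamps rather than from row positions.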
However, when I use this code:
pd <- ddply(df1, .(code), function(df) {
  df <- rollingMean(df, pollutant = "co", width = 8,
                    new.name = "rolling", data.thresh = 75)
  return(df)
})
I get this error:
Error: 'by' is NA
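The message suggests that a time step could not be worked out somewhere, so the date column of each station may be worth inspecting. This is a guess, not something confirmed from openair's source. The sketch below uses base R only; `check_dates` and the `toy` data frame are hypothetical names made up for illustration, and the toy rows deliberately contain one duplicated and one missing timestamp.

```r
# Hypothetical helper (not part of openair or plyr): per-station summary
# of the date column, counting rows, missing dates and duplicated dates.
check_dates <- function(df, group_col, date_col = "date") {
  pieces <- split(df[[date_col]], df[[group_col]])
  data.frame(
    group  = names(pieces),
    n_rows = unname(vapply(pieces, length, integer(1))),
    n_na   = unname(vapply(pieces, function(d) sum(is.na(d)), integer(1))),
    n_dup  = unname(vapply(pieces, function(d) sum(duplicated(d)), integer(1))),
    row.names = NULL
  )
}

# Toy data (invented): station 42 has a repeated timestamp,
# station 43 has a date that is NA.
toy <- data.frame(
  date = as.POSIXct(c("2000-02-04 08:00:00", "2000-02-04 09:00:00",
                      "2000-02-04 09:00:00", NA, "2000-02-04 12:00:00"),
                    tz = "UTC"),
  co   = c(0.70, 1.40, 1.51, 1.49, 1.98),
  code = c(42, 42, 42, 43, 43)
)

is_posixct <- inherits(toy$date, "POSIXct")  # FALSE would mean the dates are not date-times
res <- check_dates(toy, "code")
print(res)
```

Running the same helper on the real data frame, for example `check_dates(df1, "code")`, would show whether any station has missing or repeated dates, or too few rows for a time step to be inferred.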
Can anyone help me with this error?
Thanks in advance.
PS: Using a similar "o3" data frame like this:
> head(var2)
date o3 codigo
2000-01-01 01:00:00 23.25 1
2000-01-01 02:00:00 20.08 1
2000-01-10 16:00:00 63.67 1
2000-01-10 17:00:00 80.64 1
2000-01-10 18:00:00 86.48 1
2000-01-10 19:00:00 61.48 1
and this command:
pd <- ddply(var2, .(codigo), function(df) {
  df <- rollingMean(df, pollutant = "o3", width = 8,
                    new.name = "medmov", data.thresh = 75)
  return(df)
})
the code works just fine, showing:
> head(pd)
date o3 codigo medmov
2000-01-01 01:00:00 23.25 1 NA
2000-01-01 02:00:00 20.08 1 NA
2000-01-01 03:00:00 22.31 1 NA
2000-01-01 04:00:00 23.02 1 22.1650
2000-01-01 05:00:00 12.40 1 20.2120
2000-01-01 06:00:00 11.67 1 16.2575