I have the following problem in R: I would like to create a ts() object (i.e. a regular time series) from an irregular time series (i.e. a list of dates and data values).

You can reproduce the problem with the following data set and R script:

# dput(dd) result    
dd <- structure(list(NDVI = structure(c(14L, 4L, 11L, 12L, 20L, 17L, 
    5L, 7L, 21L, 23L, 25L, 19L, 15L, 9L, 3L, 24L, 2L, 6L, 22L, 16L, 
    13L, 18L, 10L, 8L, 1L), .Names = c("1", "2", "3", "4", "5", "6", 
    "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", 
    "18", "19", "20", "21", "22", "23", "24", "25"), .Label = c("0.4186", 
    "0.5452", "0.5915", "0.5956", "0.6010", "0.6860", "0.6966", "0.7159", 
    "0.7161", "0.7264", "0.7281", "0.7523", "0.7542", "0.7701", "0.7751", 
    "0.7810", "0.7933", "0.8075", "0.8113", "0.8148", "0.8207", "0.8302", 
    "0.8305", "0.8369", "0.9877"), class = "factor"), DATUM = structure(c(11005, 
    11021, 11037, 11085, 11101, 11117, 11133, 11149, 11165, 11181, 
    11197, 11213, 11229, 11245, 11261, 11277, 11293, 11309, 11323, 
    11339, 11355, 11371, 11387, 11403, 11419), class = "Date")), .Names = c("NDVI", 
    "DATUM"), row.names = c("1", "2", "3", "4", "5", "6", "7", "8", 
    "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", 
    "20", "21", "22", "23", "24", "25"), class = "data.frame")

require(zoo)
dd$DATUM <- as.Date(dd$DATUM,"A%Y%j") # Ayear,julianday
z <- zoo(dd$NDVI,dd$DATUM,frequency=23)
z  # intended: a regular time series with frequency=23 and start=c(2000,1)
# in 2000 there are 5 time steps (2 in Jan, 1 in Feb, 2 in Apr) for which no data are available;
# these should be marked as NA in the final regular time series
ts.z <- as.ts(z,start=c(2000,1),frequency=23)

But this does not work: I obtain a very long regular time series with daily time steps. I would like a ts object with frequency=23 in which the positions without data are marked as NA.

I have tried everything based on the example for yearly data listed here: "Convert a irregular time series to a regular time series"

but it does not work for data with a frequency of 23 (i.e. 23 values a year). I think I could solve it by not storing dd$DATUM as a Date, and instead using a zoo index that can be ordered as a time series with 23 values a year.

Any ideas?

Thanks for your help

Janvb

1 Answer

23 does not divide evenly into the number of days in a year, so you will have to synthesize your own time scale in which each year is divided into 23 equal pieces. Convert dd (the version whose times have class "Date") to zoo, then create a new series on a scale made up of the year plus a fraction (the 0-based 16-day step within the year, divided by 23). Finally, convert that to a ts series:

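library(zoo)
# NDVI is stored as a factor, so convert it to numeric via character
z <- zoo(as.numeric(as.character(dd[[1]])), dd[[2]])
lt <- unclass(as.POSIXlt(time(z)))
yr <- lt$year + 1900   # calendar year
jul <- lt$yday         # 0-based day of the year
# smallest within-year gap between observations, in days
delta <- min(unlist(tapply(jul, yr, diff))) # 16
# new index: year plus the 16-day step expressed as a fraction of 23 steps per year
zz <- aggregate(z, yr + jul / delta / 23)

as.ts(zz)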

giving:

Time Series:
Start = c(2000, 4) 
End = c(2001, 7) 
Frequency = 23 
 [1] 0.7701 0.5956 0.7281     NA     NA 0.7523 0.8148 0.7933 0.6010 0.6966
[11] 0.8207 0.8305 0.9877 0.8113 0.7751 0.7161 0.5915 0.8369 0.5452 0.6860
[21] 0.8302 0.7810 0.7542 0.8075 0.7264 0.7159 0.4186
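
The same regular grid can also be built without zoo. The following is a minimal base-R sketch, not the code from the answer: it assumes the dates fall exactly on the 16-day composite steps and are sorted in increasing order, and it uses a hypothetical five-row stand-in (`dates`, `ndvi`) for `dd` with NDVI already numeric. The names `slot`, `idx`, `full`, `vals` and `ts.base` are made up for illustration.

```r
# Hypothetical stand-in for the first five rows of dd (NDVI as numeric, not factor)
dates <- as.Date(c(11005, 11021, 11037, 11085, 11101), origin = "1970-01-01")
ndvi  <- c(0.7701, 0.5956, 0.7281, 0.7523, 0.8148)

lt   <- as.POSIXlt(dates)
yr   <- lt$year + 1900      # calendar year
slot <- lt$yday %/% 16      # 0-based 16-day step within the year (0..22)
idx  <- yr * 23 + slot      # position on a grid with 23 steps per year

# Every grid position between the first and last observation
full <- seq(min(idx), max(idx))

# Fill observed positions; unobserved positions stay NA
vals <- rep(NA_real_, length(full))
vals[match(idx, full)] <- ndvi

# Assumes dates are sorted, so the first element marks the series start
ts.base <- ts(vals, start = c(yr[1], slot[1] + 1), frequency = 23)
ts.base
```

With this stand-in data, the two missing April steps should appear as NA in positions 4 and 5 of `ts.base`.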
G. Grothendieck
  • Yes. This looks great but it is not yet working here since the variable zt is not defined. PS: why do you divide by 16? Thanks for your help. – Janvb Jan 27 '11 at 17:48
  • @Jan I'm sure he'll fix it, but my guess is zt<-time(z). Also, 16*23, which equals 368, appears to be the synthesized time scale for a year. – bill_080 Jan 27 '11 at 18:30
  • @Jan, As Bill pointed out zt is time(z) but I have simplified and cleaned up the code since then which has eliminated it. The 16 comes from the fact that `diff(time(z))` shows the points are 16 days or multiples of 16 days apart. – G. Grothendieck Jan 27 '11 at 18:35
  • Works great! Beautiful solution. Thanks heaps. The data is actually MODIS satellite data which is summarised (composited) into 16-day time steps. This is a smart solution to create a regular time series (ts class). Cheers – Janvb Jan 28 '11 at 08:10
  • Have added a calculation which gives `16` so that that value is better justified. – G. Grothendieck Jan 28 '11 at 18:49
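
To see where the `16` discussed in the comments comes from, and why the answer computes the gaps per year, here is a small sketch using five dates from around the 2000/2001 boundary of the question's data (the names `raw.gaps` and `year.gaps` are illustrative only). Across the year boundary the raw gap is 14 days, because the 16-day steps restart on 1 January; within each year every gap is 16.

```r
# Five consecutive dates from the question's data, spanning the year boundary
dates <- as.Date(c(11277, 11293, 11309, 11323, 11339), origin = "1970-01-01")

# Gaps in days over the whole series, ignoring year boundaries
raw.gaps <- as.numeric(diff(dates))

lt  <- as.POSIXlt(dates)
yr  <- lt$year + 1900
jul <- lt$yday                              # 0-based day of the year

# Gaps computed separately within each year, as in the answer
year.gaps <- unlist(tapply(jul, yr, diff))
delta <- min(year.gaps)

raw.gaps    # includes the shorter gap across the year boundary
year.gaps   # within-year gaps only
delta * 23  # length of the synthesized year in days
```

Since 16 * 23 = 368, the synthesized year is slightly longer than a calendar year, which is why the last step of each year is cut short.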