I have a problem with missing data, but the gaps are not marked as NA (which would make them easier to handle); the rows are simply absent.

My data looks like this:

time, value
2012-11-30 10:28:00, 12.9
2012-11-30 10:29:00, 5.5
2012-11-30 10:30:00, 5.5
2012-11-30 10:31:00, 5.5
2012-11-30 10:32:00, 9
2012-11-30 10:35:00, 9
2012-11-30 10:36:00, 14.4
2012-11-30 10:38:00, 12.6

As you can see, some minute values are missing. The data is an xts/zoo object, and I use as.POSIXct to set the time as the index. How can I add the missing time steps to get a complete series? I want to fill the missing values by linear interpolation.

Thank you for your help!

Herr Student
  • Found one possible answer: http://stackoverflow.com/questions/15114834/interpolate-zoo-object-with-missing-dates?rq=1. It works with zoo, and afterwards I can convert back to xts. I still have the problem with the "wrong" values: how do I filter them and set them to NA? Thanks! – Herr Student Apr 15 '13 at 10:49
  • See also http://stackoverflow.com/questions/11897169/change-nas-to-interpolated-flat-bars – Darren Cook Apr 15 '13 at 12:38

2 Answers

You can merge your data with an empty zoo series indexed by every minute in the range. The missing time steps then appear as NA, and na.approx fills them by linear interpolation.

library(zoo)

data1 <- read.table(text="time, value
2012-11-30-10:28:00, 12.9
2012-11-30-10:29:00, 5.5
2012-11-30-10:30:00, 5.5
2012-11-30-10:31:00, 5.5
2012-11-30-10:32:00, 9
2012-11-30-10:35:00, 9
2012-11-30-10:36:00, 14.4
2012-11-30-10:38:00, 12.6", header = TRUE, sep=",", as.is=TRUE)
# Parse the timestamps and build a zoo series indexed by time
times.init <- as.POSIXct(strptime(data1[,1], '%Y-%m-%d-%H:%M:%S'))
data2 <- zoo(data1[,2], times.init)
# Merge with an empty zoo series on a full one-minute grid; missing minutes become NA
data3 <- merge(data2, zoo(, seq(min(times.init), max(times.init), by = "min")))
# Fill the NAs by linear interpolation
data4 <- na.approx(data3)
Pierre Lapointe
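
To make it clearer what the interpolation step produces for the gaps, here is a minimal sketch in base R only. It assumes the timestamps are re-expressed as minutes after 10:28 (the names minute, value and filled are illustrative and not part of the answer) and uses stats::approx, which the zoo documentation describes as the function behind na.approx.

```r
# Observed samples from the question, as minutes after 10:28.
# Minutes 5, 6 and 9 (10:33, 10:34, 10:37) are the missing time steps.
minute <- c(0, 1, 2, 3, 4, 7, 8, 10)
value  <- c(12.9, 5.5, 5.5, 5.5, 9, 9, 14.4, 12.6)

# Interpolate linearly onto every minute from 0 to 10.
filled <- approx(x = minute, y = value, xout = 0:10, method = "linear")

# Show the completed series: one row per minute.
data.frame(minute = filled$x, value = filled$y)
```

In this sketch the two missing minutes between the 9s stay at 9, and minute 9 (10:37) lands halfway between 14.4 and 12.6, at 13.5.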

Thanks to P Lapointe for a cool answer. If you use the 'xout' argument of na.approx, you no longer need the merge:

library(zoo)

data1 <- read.table(text="time, value
2012-11-30-10:28:00, 12.9
2012-11-30-10:29:00, 5.5
2012-11-30-10:30:00, 5.5
2012-11-30-10:31:00, 5.5
2012-11-30-10:32:00, 9
2012-11-30-10:35:00, 9
2012-11-30-10:36:00, 14.4
2012-11-30-10:38:00, 12.6", header = TRUE, sep=",", as.is=TRUE)
times.init <- as.POSIXct(strptime(data1[,1], '%Y-%m-%d-%H:%M:%S'))
data2 <- zoo(data1[,2], times.init)
data2
# xout gives the time points to interpolate at, here a full one-minute grid
data4 <- na.approx(object = data2,
                   xout = seq(min(times.init), max(times.init), by = "min"))
sfuj
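
Since the question works with xts, the interpolated zoo object can be converted back afterwards. The following is only a sketch under assumptions: it needs both the zoo and xts packages installed, uses a three-row subset of the question's data, and the names z, grid, filled and filled_xts are illustrative.

```r
library(zoo)
library(xts)

# Small subset of the question's data (10:37 is missing).
times <- as.POSIXct(c("2012-11-30 10:35:00",
                      "2012-11-30 10:36:00",
                      "2012-11-30 10:38:00"), tz = "UTC")
z <- zoo(c(9, 14.4, 12.6), times)

# Interpolate onto a full one-minute grid via xout, as in the answer above.
grid <- seq(min(times), max(times), by = "min")
filled <- na.approx(z, xout = grid)

# Convert the zoo result back to an xts object.
filled_xts <- as.xts(filled)
filled_xts
```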