I have a large hourly time series data set of temperatures. The series contained a number of missing values (NA), so I imputed them by linear interpolation using the imputeTS package. I was told to convert the temperatures to a zoo object before interpolating and then to add the imputed series to the data as a new column, so that any NA temperature is replaced with an imputed one.
I am doing a heating degree day analysis, which estimates the heating required to bring a building up to room temperature. Heating is required when the outside temperature is below 15.5 degrees. I want to ignore (or set to NA) values above 15.5 and focus only on the temperatures below it. I would then like to calculate the heating degree days for each hour as (15.5 - Temp) * 1/24 (there are 24 hours in a day). This is usually simple, but I am having trouble with the zoo object. Can anyone help?
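To make the calculation concrete, here is a minimal sketch of the formula on a plain numeric vector (no zoo object involved). The names base_temp, temp, temp_below and hdd and the sample temperatures are made up for illustration.

```r
# Hypothetical hourly temperatures, some below and some at or above the base
base_temp <- 15.5
temp <- c(0.8, 12.0, 15.5, 18.2)

# Keep only temperatures below the base; everything else becomes NA
temp_below <- ifelse(temp < base_temp, temp, NA)

# Hourly contribution to heating degree days: (base - temp) * 1/24
hdd <- (base_temp - temp_below) / 24
hdd
```

The first two hours get a positive value and the last two stay NA, so summing with sum(hdd, na.rm = TRUE) would give the total over the period.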
An example of the data is:
DateTimes <- as.POSIXct(c("2009-01-01 00:00:00", "2009-01-01 01:00:00", "2009-01-01 02:00:00", "2009-01-01 03:00:00", "2009-01-01 04:00:00", "2009-01-01 05:00:00", "2009-01-01 06:00:00"))
MeanTemp <- c(0.8, 0.7, 0.7, NA, 0.8, 0.9, 1.1)
HourTemp <- data.frame(DateTimes, MeanTemp)
These are my imputation steps:
# Use linear interpolation to impute missing values
library(zoo)
TempImp <- zoo(HourTemp$MeanTemp, HourTemp$DateTimes)
# Newer imputeTS releases (3.0+) name this na_interpolation(); older ones use na.interpolation()
TempImp <- imputeTS::na_interpolation(TempImp, option = "linear")
# Add imputed values to the data (this column is still a zoo object)
HourTemp$airTempImp <- round(TempImp, 1)
# Add imputed flag
HourTemp$Imputed <- ifelse(is.na(HourTemp$MeanTemp), "Imputed", "Observed")
HourTemp
The imputation worked, replacing the NA values with estimates, but I cannot manipulate the zoo object 'airTempImp' to create a heating degree days column as described in the second paragraph.
I have tried ifelse, ifelse.zoo and transform, but none of them seem to work.
Thanks!
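One possible direction, offered as a sketch rather than a confirmed fix: strip the zoo index with coredata() so the data frame column is an ordinary numeric vector, after which ifelse behaves as usual. To keep the example to a single package, zoo::na.approx is used here in place of the imputeTS call; it also interpolates linearly. The names base_temp and HDD are assumptions introduced for illustration.

```r
library(zoo)

DateTimes <- as.POSIXct(c("2009-01-01 00:00:00", "2009-01-01 01:00:00",
                          "2009-01-01 02:00:00", "2009-01-01 03:00:00",
                          "2009-01-01 04:00:00", "2009-01-01 05:00:00",
                          "2009-01-01 06:00:00"), tz = "UTC")
MeanTemp <- c(0.8, 0.7, 0.7, NA, 0.8, 0.9, 1.1)
HourTemp <- data.frame(DateTimes, MeanTemp)

# Linear interpolation on a zoo series (na.approx swapped in for imputeTS)
TempImp <- na.approx(zoo(HourTemp$MeanTemp, HourTemp$DateTimes))

# coredata() drops the time index, leaving a plain numeric vector
HourTemp$airTempImp <- round(coredata(TempImp), 1)

# Heating degree days per hour; hours at or above the base are set to NA
base_temp <- 15.5
HourTemp$HDD <- ifelse(HourTemp$airTempImp < base_temp,
                       (base_temp - HourTemp$airTempImp) / 24,
                       NA)
HourTemp
```

Because airTempImp is now a plain numeric column, the same ifelse expression should also work inside transform().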