I'm looking to impute missing values in a set of cointegrated time series. In this hypothetical case, I want to impute FB's price series on both its leading and trailing sides. I tried the forecast package and the imputeTS package, but they only fill the gaps with a straight line.
library(quantmod)   # getSymbols(), Ad()
library(ggplot2)    # ggplot(), geom_line()
library(reshape2)
library(data.table)
tickers <- c("MSFT", "SPY","FB")
dataEnv <- new.env()
getSymbols(tickers, from="2010-06-30", to="2015-06-30", env=dataEnv)
plist <- eapply(dataEnv, Ad)
pframe <- as.data.frame(do.call(merge, plist))
pframe <- as.data.table(pframe, keep.rownames = TRUE)
pframe[, rn := as.Date(rn)]  # row names come back as character; convert for a date axis
## Remove the most recent 151 rows of FB for the example
pframe$FB.Adjusted[(nrow(pframe) - 150):nrow(pframe)] <- NA
## Plot 3 series
meltdf <- melt(pframe,id="rn")
ggplot(meltdf,aes(x=rn, y=value, colour=variable, group=variable)) + geom_line()
I then applied the interpolation routines below. The problem is that each one draws a straight line from the start and end points of a gap instead of imputing correlated prices that account for FB's relationship to the other two series:
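library(imputeTS)
## na.interpolation() was renamed na_interpolation() in imputeTS 3.0
pframeImp <- copy(pframe)
pframeImp[, FB.Adjusted := na_interpolation(FB.Adjusted)]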
meltdf <- melt(pframeImp,id="rn")
ggplot(meltdf,aes(x=rn, y=value, colour=variable, group=variable)) + geom_line()
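library(forecast)
## na.interp() expects a single series, so apply it to the FB column only
pframeImp <- copy(pframe)
pframeImp[, FB.Adjusted := as.numeric(na.interp(FB.Adjusted))]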
meltdf <- melt(pframeImp,id="rn")
ggplot(meltdf,aes(x=rn, y=value, colour=variable, group=variable)) + geom_line()
I would like to use some sort of multiple imputation method to impute the leading and trailing sides of FB, accounting for its cointegration with each of the other series.
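To make the goal concrete, here is a rough sketch of the kind of result I'm after. It is not a proper cointegration-based imputation: it fits a static regression of the log series with gaps on the two complete log series (roughly the first step of an Engle-Granger approach) and predicts into the gaps. The data are simulated stand-ins, and the names `msft`, `spy`, `fb`, `fb_imp`, and `draws` are hypothetical. The repeated draws add i.i.d. residual noise, which ignores serial correlation in the residuals, so they only approximate a multiple imputation.

```r
set.seed(42)
n <- 500

# Hypothetical stand-ins for the MSFT and SPY log prices: two random walks
msft <- log(30)  + cumsum(rnorm(n, sd = 0.010))
spy  <- log(120) + cumsum(rnorm(n, sd = 0.008))

# Hypothetical "FB" log price: a linear combination of the two series
# plus a stationary AR(1) error, so the three series move together
err <- as.numeric(arima.sim(list(ar = 0.8), n = n, sd = 0.01))
fb  <- 0.5 + 0.6 * msft + 0.4 * spy + err

df <- data.frame(msft, spy, fb)

# Blank out the leading and trailing sides, as in the question
miss <- seq_len(n) <= 100 | seq_len(n) > 400
df$fb[miss] <- NA

# Static regression in log levels; lm() drops the rows with missing fb
fit  <- lm(fb ~ msft + spy, data = df)
pred <- predict(fit, newdata = df[miss, c("msft", "spy")])

# Single imputation: fill the gaps with the fitted values
df$fb_imp <- df$fb
df$fb_imp[miss] <- pred

# Crude multiple imputation: m draws around the fitted values,
# using i.i.d. normal noise with the residual standard error
m <- 5
draws <- replicate(m, pred + rnorm(sum(miss), sd = sigma(fit)))
```

Each column of `draws` is one candidate fill for the missing rows, and `exp()` converts the log values back to price levels. The imputed segments follow the other two series instead of a straight line, which is the behaviour I want from a more principled method.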