I'm using the UKDriverDeaths dataset.

I'm trying to combine the Holt-Winters prediction function with ggplot().

Basically, I want to (1) reproduce the base plot below in ggplot and (2) add confidence intervals.

This is the data and the base plot:

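data('UKDriverDeaths')
# Fit on data through December 1982, then forecast the next 10 months
past <- window(UKDriverDeaths, end = c(1982, 12))
hw <- HoltWinters(past)
pred <- predict(hw, n.ahead = 10)
# Base-graphics plot of fitted and predicted values, with the full series overlaid
plot(hw, pred, ylim = range(UKDriverDeaths))
lines(UKDriverDeaths)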

This is my solution for (1), recreating the plot in ggplot():

library(xts)
library(ggplot2)
ts_pred <- ts(c(hw$fitted[, 1], pred), start = 1970, frequency = 12)
df <- merge(as.xts(ts_pred), as.xts(UKDriverDeaths))
names(df) <- c("predicted", "actual")
ggplot(df, aes(x=as.POSIXct(index(df)))) + 
  geom_line(aes(y=predicted), col='red') + 
  geom_line(aes(y=actual), col='black') + 
  theme_bw() +
  geom_vline(xintercept=as.numeric(as.POSIXct("1982-12-01")), linetype="dashed") + 
  labs(title="Holt-Winters filtering\n", x="Time", y="Observed / Fitted") + 
  theme(plot.title = element_text(size=18, face="bold"))

For (2), I'm looking for confidence intervals for the Holt-Winters prediction.

David Arenburg
user3608523
  • If you do `?predict.HoltWinters` the two arguments you need to add to get confidence intervals will be apparent. – MattBagg Jun 12 '14 at 18:48

1 Answer

Confidence intervals for predictions are usually called "prediction intervals". The predict.HoltWinters function will return them if you ask with prediction.interval = TRUE. So you can do

pred <- predict(hw, n.ahead = 10, prediction.interval = TRUE)
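
To see what that call returns, here is a minimal sketch using base R only. It repeats the setup from the question so it runs on its own; the name `pred_int` is just for illustration, and `level = 0.95` is written out even though it is the default.

```r
# Same setup as in the question (base R, no extra packages)
data("UKDriverDeaths")
past <- window(UKDriverDeaths, end = c(1982, 12))
hw <- HoltWinters(past)

# Request prediction intervals; level = 0.95 is the default, shown for clarity
pred_int <- predict(hw, n.ahead = 10, prediction.interval = TRUE, level = 0.95)

# Inspect the structure: one row per forecast month,
# with the point forecast and the interval bounds as columns
print(colnames(pred_int))
print(dim(pred_int))
print(head(pred_int, 3))
```
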

This changes the shape of the returned value. Rather than a single series of predictions, you get a matrix with fit, upr, and lwr columns, so you'll need to adjust some of your other transformation code to deal with this. Try

ts_pred <- ts(rbind(cbind(hw$fitted[, 1],upr=NA,lwr=NA), pred), 
    start = 1970, frequency = 12)
df <- merge(as.xts(UKDriverDeaths), as.xts(ts_pred))
names(df)[1:2] <- c("actual", "predicted")

This attempts to make sure all the columns get correctly lined up and labeled between the observed values and the predicted values.

Now we can just add two lines to the plot:

ggplot(df, aes(x=as.POSIXct(index(df)))) + 
  geom_line(aes(y=predicted), col='red') + 
  geom_line(aes(y=actual), col='black') + 
  geom_line(aes(y=upr), col='blue') + 
  geom_line(aes(y=lwr), col='blue') + 
  theme_bw() +
  geom_vline(xintercept=as.numeric(as.POSIXct("1982-12-01")), 
      linetype="dashed") + 
  labs(title="Holt-Winters filtering\n", x="Time", y="Observed / Fitted") + 
  theme(plot.title = element_text(size=18, face="bold"))

[Figure: Holt-Winters ggplot2 plot]
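
If a shaded band is preferred over two separate lines, `geom_ribbon` can draw it. The sketch below is one way to do that, not the code used for the plot above: it skips xts and builds plain data frames instead, and `month_dates` is a hypothetical helper introduced here that assumes a monthly series.

```r
library(ggplot2)

data("UKDriverDeaths")
past <- window(UKDriverDeaths, end = c(1982, 12))
hw   <- HoltWinters(past)
pred <- predict(hw, n.ahead = 10, prediction.interval = TRUE)

# Hypothetical helper (not part of any package): first-of-month dates
# for a monthly ts, derived from its start() year and month.
month_dates <- function(x) {
  s <- start(x)
  first <- as.Date(sprintf("%d-%02d-01", as.integer(s[1]), as.integer(s[2])))
  seq(first, by = "month", length.out = NROW(x))
}

# One data frame per series, so each keeps its own date range
actual <- data.frame(date = month_dates(UKDriverDeaths),
                     value = as.numeric(UKDriverDeaths))
fit_df <- data.frame(date = month_dates(hw$fitted),
                     value = as.numeric(hw$fitted[, "xhat"]))
fc     <- data.frame(date = month_dates(pred),
                     fit = as.numeric(pred[, "fit"]),
                     lwr = as.numeric(pred[, "lwr"]),
                     upr = as.numeric(pred[, "upr"]))

p <- ggplot() +
  # Shaded prediction interval over the forecast horizon
  geom_ribbon(data = fc, aes(x = date, ymin = lwr, ymax = upr),
              fill = "blue", alpha = 0.3) +
  geom_line(data = actual, aes(x = date, y = value), colour = "black") +
  geom_line(data = fit_df, aes(x = date, y = value), colour = "red") +
  geom_line(data = fc, aes(x = date, y = fit), colour = "red") +
  # Dashed line at the end of the training window
  geom_vline(xintercept = as.numeric(as.Date("1982-12-01")),
             linetype = "dashed") +
  theme_bw() +
  labs(title = "Holt-Winters filtering", x = "Time", y = "Observed / Fitted")

print(p)
```

Keeping the forecast in its own data frame means the ribbon covers only the forecast months, with no NA padding needed for the fitted period.
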

David Arenburg
MrFlick
  • (+1) For a shaded region around the prediction you can use `geom_ribbon(aes(ymin=lwr,ymax=upr),alpha=0.3, fill="blue")` – MattBagg Jun 12 '14 at 19:06