2

I am new to R and have found this site extremely helpful, so this covers the second half of my question (one issue per post). Thank you for your assistance ahead of time.

Background: I was plotting historical data with multiple forecasts overlaid for visual accuracy checks. This worked great when displayed on an x axis of 'observations'. However, the data is more understandable when plotted with dates on the x axis, so I made it a time series using ts() and it plotted the time series data as expected. However, (A) it did not plot the forecast data on the time scale because they are not a time series; and (B) I was unsure how to force the x axis to plus 1 year to permit the forecast to display.

Question: (A) How do I restore the original time stamps to the forecast data? I know that I could manually recreate the time series, but this would be required in every iteration of the forecast. I have considered using forecast() instead of predict(), but the additional forecast iterations still have the same issue of not being a time series. Is there a simple way to restore the original time stamp to the forecast data?

  require(forecast)  [EDITED for clarity]

  data <- rep(cos(1:52*(3.1416/26)),5)*100+1000

  arima.ts <- ts(data,start=c(2009,1),frequency=52) #not plotted as time series

  # Create the current fit on data and predict one year out
  plot(arima.ts, type="l", xlab="weeks", ylab="counts",
  main="Overlay forecasts & actuals",
       sub="green=FIT(1-105,by 16) wks back & PREDICT(26) wks, blue=52 wks")
  ############## This plotted correctly as "Arima(data),..."
  arima.fit <- auto.arima(tail(arima.ts,156)) 
  arima.pred <- predict(arima.fit, n.ahead=52)
  lines(arima.pred$pred, col="blue")
  lines(arima.pred$pred+2*arima.pred$se, col="red")
  lines(arima.pred$pred-2*arima.pred$se, col="red")

  # Loop back and perform comparison plotting of forecast to actuals
  for (j in seq(1,105,by=16)) { 
    result <- tryCatch({
      ############## This plotted correctly as "Arima(head(data,-j),..."
      arima1.fit <- auto.arima(head(tail(arima.ts,-j),156))
      arima1.pred <- predict(arima1.fit, n.ahead=52)
      lines(arima1.pred$pred, col="green", lty=(numtests %% 6) + 1 )
    }, error = function(e) {return(e$message)}) ## Trap errors
  }
DouglasM
  • 133
  • 3
  • 9
  • I am continuing to investigate and resolve this, but hoping there is a simpler solution. Currently found a really good paper: [Washington University: Working with Time Series Data in R.pdf](http://faculty.washington.edu/ezivot/econ424/Working%20with%20Time%20Series%20Data%20in%20R.pdf) I will use 'start()' and 'end()' based on indexes into the arima.ts data, then use 'window()' instead of 'head(tail())' perform the forecast on the data subset, and use the previous 'end()' to restore the forecast into a ts object, then plot the object onto the existing 'plot()' using the 'lines()' command. – DouglasM Aug 09 '13 at 20:14

2 Answers2

2

The core question being addressed is "how to restore the original time stamps to the forecast data". What I have learned with trial and error is "configure, then never loose the time series attribute" by applying these steps:

1: Make a time series Use the ts() command and create a time series.
2: Subset a time series Use 'window()' to create a subset of the time series in 'for()' loop. Use 'start()' and 'end()' on the data to show the time axis positions.
3: Forecast a time series Use 'forecast()' or 'predict()' which operate on time series.
4: Plot a time series When you plot a time series, then the time axis will align correctly for additional data using the lines() command. {Plotting options are user preference.}

This causes the forecasts to be plotted over the historical data in the correct time axis location.

  require(forecast)     ### [EDITED for clarity]

  data <- rep(cos(1:52*(3.1416/26)),5)*100+1000
  a.ts <- ts(data,start=c(2009,1),frequency=52)

  ## Predict from previous '3' years then one year out & generate the plot
  a.win  <- window(a.ts,start=c(end(a.ts)[1]-3,end(a.ts)[2]),frequency=52)
  a.fit  <- auto.arima(a.win)  
  a.pred <- forecast(a.fit, h=52)
  plot(a.pred, type="l", xlab="weeks", ylab="counts",
       main="Overlay forecasts & actuals",
       sub="green=FIT(1-105,by 16) wks back & PREDICT(26) wks, blue=52 wks")

  for (j in seq(1, 90, by=8)) {   ## Loop to overlay early forecasts 
    result1 <- tryCatch({
      b.end   <- c(end(a.ts)[1],end(a.ts)[2]-j) ## Window the time series  
      b.start <- c(b.end[1]-3,b.end[2])
      b.window <- window(a.ts, start=b.start, end=b.end, frequency=52)

      b.fit  <-auto.arima(b.window) 
      b.pred <- forecast(b.fit, h=26)
      lines(b.pred$mean, col="green", lty="dashed" )
    }, error = function(e) {return(e$message)} ) ## Skip Errors
  }
DouglasM
  • 133
  • 3
  • 9
1
install.packages(c("forecast"))

library(forecast)

# Load your data
data <- c(11,53,50,53,57,69,70,65,64,66,66,64,61,65,69,61,67,71,74,71,77,75,85,88,95,
          93,96,89,95,98,110,134,127,132,107,94,79,72,68,72,70,66,62,62,60,59,61,67,
          74,87,112,134,51,50,38,40,44,54,52,51,48,50,49,49,48,57,52,53,50,50,55,50,
          55,60,65,67,75,66,65,65,69,72,93,137,125,110,93,72,61,55,51,52,50,46,46,45,
          48,44,45,53,55,65,89,112,38,7,39,35,37,41,51,53,57,52,57,51,52,49,48,48,51,
          54,48,50,50,53,56,64,71,74,66,69,71,75,84,93,107,111,112,90,75,62,53,51,52,
          51,49,48,49,52,50,50,59,58,69,95,148,49,83,40,40,40,53,57,54,52,56,53,55,
          55,51,54,45,49,46,52,49,50,57,58,63,73,66,63,72,72,71,77,105,97,104,85,73,
          66,55,52,50,52,48,48,46,48,53,49,58,56,72,84,124,76,4,40,39,36,38,48,55,49,
          51,48,46,46,47,44,44,45,43,48,46,45,50,50,56,62,53,62,63)

data2 <- c(rnorm(237))

# Make data a time series, starting Jan 2009
data.ts<-ts(data, start=c(2009,1),frequency=52)
data2.ts<-ts(data2, start=c(2009,1),frequency=52)

# Plot just the time series
plot(data.ts)

# Do the arima (and other functions you wish)
fit <- arima(data.ts)
fit2 <- arima(data2.ts)
# This part should solve your timeseries problem
# h=1 specifies 1 frequency (or in this case, a week) ahead
data.forecast <- forecast(fit, h=1)
data2.forecast <- forecast(fit2,h=1)

#plot the forecast data
plot(data.forecast)

# suppose you have another data set, surpress the first graph
par(new=T)

# plot the next graph
plot(data2.forecast)

Let me know if you need any further elaboration.

Irsal
  • 156
  • 6
  • I am using the forecast package, and will probably start with that as the base plot. However, the real question remains: Plotting multiple forecasts computed as repeated Arima() commands on subsets of the known data. The goal is forecasts that overlay the known data. – DouglasM Aug 09 '13 at 22:20
  • See my edits above. Essentially, you have to suppress your first timeseries forecast, then plot the next one. Does that answer your question? – Irsal Aug 11 '13 at 21:19
  • The 'par(new=T)' does permit plots to overlay, however, it kept overwriting with a different time axis and the plot became illegible. The core question of *how to restore the original time stamps to the forecast data* to permit overlaying multiple forecasts over the historical data along the time axis remains. I posted my updated solution as an answer in hopes that a more elegant solution will follow. – DouglasM Aug 13 '13 at 21:16