R's decompose/stl functions unable to completely extract annual seasonality from daily time series

Question

Context first, questions at the bottom.

I have 10 years of daily precipitation data that exhibits an annual seasonality, which I am trying to model using ARMA methods and then forecast. Data here, time series object creation below.

I am aware that common R packages and functions struggle with daily time series. For example, the arima() function of Forecast does not accept frequencies above 350, and ts() does not accept non-integer values for frequencies (both of which would be useful as the average number of days in a year is 365.25).

Apparently, the msts() function of Forecast can accept a non-integer values for the seasonal.perdiods argument, so I have created my time series like so:

the_ts <- msts(data$PRCP, start=c(2007, 10), end=c(2017, 9),  seasonal.periods=c(365.25))

Time Series:
Start = c(2007, 10) 
End = c(2017, 9) 
Frequency = 365 
   [1] 0.09 0.75 1.63 0.06 0.36 0.63 0.76 0.43 0.13 0.00 0.00 0.02 0.31 1.80 0.03 0.19 0.25 0.01 0.00 0.52 0.01 0.00 0.00 0.00 0.00 ... etc

--

plot(the_ts)

This series is stationary, so it does not need to be differenced.

I then wish to decompose the series, to extract the seasonality and trend, leaving the residuals left over, which if successful, should approximate white noise.

Below is the plotted decomposition. I'v also tried to use decompose() and played with numerous parameter adjustments.

the_ts_decomp = stl(the_ts, s.window = "periodic")
plot(the_ts_decomp)

Apparently, there is seasonality of some type left over in the residuals, as the shape of the seasonal trend in the data is paralleled in the residuals. Let's remove the identified seasonal component and inspect the supposedly deseasonalized data:

the_ts_deseasonal <- seasadj(the_ts_decomp)
plot(the_ts_deseasonal)

Still looks pretty seasonal to me. ACF and PACFs (not pictured) confirm autocorrelation is occurring.

Acf(the_ts_deseasonal, lag.max=1000)
Pacf(the_ts_deseasonal, lag.max=1000)

Apparently it is possible to account for the seasonal component of the series when forecasting, by passing a fourier transform into the ARMA model as an exogenous covariate, as described by the author of the Forecast package here and here, but I am unclear on exactly how this relates to the need to decompose and deseasonalize the series before fitting a model.

I am able to produce the following forecast using that method based on the above seemingly insufficiently decomposed data. The forecast doesn't seem like a home run, and the residuals confirm something is off:

the_ts.fit <- auto.arima(the_ts, seasonal=FALSE, xreg=fourier(the_ts, K=5))
plot(forecast(the_ts.fit, h=365, xreg=fourier(the_ts, K=5, h=365)))
tsdisplay(residuals(the_ts.fit), lag.max=1000)

I don't understand Hyndman's exreg=fourier solution very well and it could be that I am missing how to correctly apply it to my circumstance.

Question 1: Don't I need to be able to decompose the data down to trendless, seasonless white noise, in order to reconstruct it for the forecast? What about when using the exreg=fourier solution?

Question 2: Why might the above code be failing to extract the seasonal component of the series, and how could I fix it?

Question 3: What package, function, or technique can use to specify an annual seasonality of 365.25?

These sound much more like statistical methods issues than programming issues. I'd suggest moving to stats.stackexchange. — Gregor Thomas, Nov 06 '18 at 19:42
As for tools, I think `prophet` is a popular alternative to `forecast`. You can check out the forecasting section of [CRAN's Timeseries Task View](https://cran.r-project.org/web/views/TimeSeries.html). — Gregor Thomas, Nov 06 '18 at 19:44

score 2 · Accepted Answer · answered Jan 06 '19 at 09:41

I am gonna try to answer the questions:

Question 1: Don't I need to be able to decompose the data down to trendless, seasonless white noise, in order to reconstruct it for the forecast? What about when using the exreg=fourier solution?

The fourier solution takes into account deterministic seasonality, i.e. it assumes that the seasonal pattern does not change over time. In general I would not say that a trend and seasonal adjusted series does need to be white noise, because there may be some short-term patterns that remain in the series.

Question 2: Why might the above code be failing to extract the seasonal component of the series, and how could I fix it?

When you specify s.window = "periodic" in the stl function, you basically assume that the seasonal pattern does not change over time. This can be one source of the problem.

Question 3: What package, function, or technique can use to specify an annual seasonality of 365.25?

The dsa function in the dsa package, for which you need to transform the time series into the xts format. For example like this:

data = rnorm(365.25*10, 100, 1)
data_xts <- xts::xts(data, seq.Date(as.Date("2007-01-01"), by="days", length.out = length(data)))

sa = dsa::dsa(data_xts, fourier_number = 24) # the fourier_number is used to model monthly recurring seasonal patterns in the regARIMA part

data_adjusted <- sa$output[,1]

R's decompose/stl functions unable to completely extract annual seasonality from daily time series

1 Answers1