The Problem
I am working on simulating a yearly series of hourly temperature data in R. The simulated time series should inherit the characteristics of past years' temperatures. It is also important that the simulation exhibits the yearly and daily seasonality. However, the simulation results in an "explosive" process: the daily range of temperature increases over time, which is not the case in the past data.
I have familiarized myself with the principles in Hyndman & Athanasopoulos (2018). For the temperature simulation itself, I am following the instructions that I found on this MathWorks page:
My Steps
1. Get rid of yearly seasonality
I have a series of five years of hourly temperature data. I fitted a sin/cos curve to the data to get rid of the long-term yearly trend. I also removed outliers with the tsoutliers() function.
I subtracted this trend and am now trying to fit a seasonal (lag 24) ARIMA model to the residuals.
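For readers who want to reproduce this step, here is a minimal sketch of fitting and subtracting a yearly sin/cos curve with base R. The data are synthetic and every name (hour, s1, c1, yearly_curve) is hypothetical; it only illustrates the idea and is not my exact code.

```r
# Sketch: remove a yearly harmonic from hourly data (synthetic example).
set.seed(42)
hours_per_year <- 365.25 * 24              # 8766 hours
n     <- as.integer(2 * hours_per_year)    # two years of hourly values
hour  <- seq_len(n)
omega <- 2 * pi / hours_per_year

# Synthetic temperatures: mean 10, yearly harmonic, Gaussian noise
temp <- 10 + 8 * sin(omega * hour) + 3 * cos(omega * hour) + rnorm(n, sd = 2)

# Regress on one sine and one cosine term with a yearly period
dat <- data.frame(temp = temp, s1 = sin(omega * hour), c1 = cos(omega * hour))
fit <- lm(temp ~ s1 + c1, data = dat)

yearly_curve <- fitted(fit)      # the fitted sin/cos curve
res          <- residuals(fit)   # what remains after subtracting it

# Residuals as a time series with daily seasonality (24 observations per day)
res_ts <- ts(res, frequency = 24)
print(round(coef(fit), 2))
```

More harmonics could be added as extra sin/cos columns if a single pair does not capture the yearly shape.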
2. Fit a seasonal arima model to residuals
I converted the residuals to a time series with a frequency of 24 (because of the daily seasonality of temperature). I can see that the residuals are not stationary. According to what I have read so far, differencing should correct for non-stationarity. I used the auto.arima() function, which correctly uses a seasonal lag. However, the AR1 coefficient is larger than 1, which (to my knowledge) indicates that even the lagged residuals are not stationary. Forcing arima to use two seasonal lags or setting auto.arima(..., stepwise=FALSE) only marginally improved the results.
Series: res_ts
ARIMA(5,0,0)(2,1,0)[24]

Coefficients:
         ar1      ar2     ar3      ar4     ar5     sar1     sar2
      1.8291  -1.1110  0.3322  -0.1057  0.0385  -0.5267  -0.2620
s.e.  0.0049   0.0101  0.0112   0.0097  0.0046   0.0047   0.0048

sigma^2 estimated as 0.134:  log likelihood=-18127.68
AIC=36271.35   AICc=36271.36   BIC=36340.85
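As far as I understand, for an AR model of order higher than 1 a single coefficient above 1 does not settle the question by itself; the usual condition is that all roots of the AR polynomial 1 - phi_1 z - ... - phi_p z^p lie outside the unit circle. A rough sketch of that check in base R, using the non-seasonal coefficients copied from the output above (the helper ar_is_stationary is hypothetical, not part of any package):

```r
# Sketch: check the stationarity condition of an AR(p) part via its roots.
ar_is_stationary <- function(phi) {
  # polyroot() takes coefficients in increasing order: 1, -phi_1, ..., -phi_p
  roots <- polyroot(c(1, -phi))
  all(Mod(roots) > 1)
}

# An AR(2) with a first coefficient above 1 that still satisfies the condition
print(ar_is_stationary(c(1.5, -0.75)))

# Non-seasonal AR coefficients from the fitted model above
phi_fit <- c(1.8291, -1.1110, 0.3322, -0.1057, 0.0385)
print(min(Mod(polyroot(c(1, -phi_fit)))))  # smallest root modulus
print(ar_is_stationary(phi_fit))
```

This only looks at the non-seasonal AR part; the seasonal difference (the 1 in (2,1,0)[24]) is a separate source of non-stationarity in the model.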
3. Simulate
In the last step, I simulate the residual temperature with the simulate() function and want to add it to the sin/cos curve I fitted. But the simulated residuals behave like an "explosive" process: both their daily range and their overall range increase over time (the real residuals range from about -20 to 20, while the simulated residuals range from -60 to 20). Thus, the simulation does not produce a realistic time series.
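One thing that might contribute: the fitted model contains a seasonal difference (D = 1), and a process simulated from a differenced model is integrated, so its spread grows with the simulation horizon. The sketch below shows this with the simplest possible case, a seasonal random walk y_t = y_{t-24} + e_t built in base R. It is not my fitted model and does not use simulate(); all names are hypothetical.

```r
# Sketch: a seasonally integrated process widens over time (toy example).
set.seed(1)
period <- 24    # hours per day
n_days <- 365

# Innovations arranged as hour-of-day (rows) x day (columns)
innov <- matrix(rnorm(period * n_days), nrow = period)

# y[h, d] = y[h, d - 1] + e[h, d]: each hour of day follows its own random walk
y <- t(apply(innov, 1, cumsum))

# Column-major flattening gives chronological order (hours 1..24 of day 1, ...)
y_series <- ts(as.vector(y), frequency = period)

# Compare the spread of the first and last 30 days
sd_first <- sd(as.vector(y[, 1:30]))
sd_last  <- sd(as.vector(y[, (n_days - 29):n_days]))
print(c(first_30_days = sd_first, last_30_days = sd_last))
```

If this is indeed the cause, a model without the seasonal difference (for example, daily seasonality handled by regressors) would be the thing to compare against.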
My Question
Does anyone know how I can improve my simulation? I have read that Box-Cox transformations can help make a time series stationary (by stabilizing its variance), but I don't know how to reverse the transformation for the simulation.
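To make the last point concrete: the Box-Cox transformation has a closed-form inverse, and the forecast package provides BoxCox() and InvBoxCox() for this. Below is a base R sketch of the forward and inverse formulas. The lambda value is hypothetical, and the shift to Kelvin is only an assumption to make the values positive, which the transformation requires.

```r
# Sketch: Box-Cox transformation and its inverse (base R, toy values).
box_cox <- function(x, lambda) {
  stopifnot(all(x > 0))                       # defined for positive values only
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

inv_box_cox <- function(y, lambda) {
  if (abs(lambda) < 1e-8) exp(y) else (lambda * y + 1)^(1 / lambda)
}

temp_c <- c(-12.5, -3.0, 0.4, 8.1, 21.7)      # temperatures in degrees Celsius
temp_k <- temp_c + 273.15                     # shift to Kelvin so all values > 0
lambda <- 0.5                                 # hypothetical value

z    <- box_cox(temp_k, lambda)               # model and simulate on this scale
back <- inv_box_cox(z, lambda) - 273.15       # undo transformation and shift

print(all.equal(back, temp_c))
```

Simulated values on the transformed scale would be passed through inv_box_cox() in the same way before adding the sin/cos curve back.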
References
Hyndman, R.J., & Athanasopoulos, G. (2018). Forecasting: Principles and Practice, 2nd edition. OTexts: Melbourne, Australia. OTexts.com/fpp2. Accessed on 16.10.2019.