How to forecast time series, including a seasonality factor in R

Question

I have the following sample data:

library(data.table)
dt <- data.table('time' = c(1:10),
                    'units'= c(89496264,81820040,80960072,109164545,96226255,96270421,95694992,117509717,105134778,0))

I would like to make a forecast for the units at time = 10.

I can see that at time = 4*k, where k = 1,2,... there is a big increase of units, and I would like to include that as a seasonality factor.

How could I do that in R? I have looked into auto.arima, but it seems that it is not the way to go.

Thanks

score 1 · Accepted Answer · answered Jul 31 '18 at 07:56

The Prophet API lets you compute predictions easily, using an additive model in which non-linear trends are fit together with yearly, weekly, and daily seasonality.

Quote from the link above:

It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.

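install.packages("prophet")
library(prophet)
# prophet() needs a data frame with columns `ds` (dates) and `y` (values), so rename/convert
# dt$time and dt$units accordingly first, and drop the placeholder row with units = 0
model <- prophet(dt) # see ?prophet for details, this builds the model (like auto.arima)
future <- make_future_dataframe(model, periods = 10) # creates the "future" data
forecast <- predict(model, future) # predictions

tail(forecast)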

Here is the complete example in R.

answered Jul 31 '18 at 07:56

RLave


prophet looks nice. thank you. Do you know if you can choose if you have additive seasonal effects or multiplicative ? – quant Jul 31 '18 at 13:37
yes, with `seasonality.mode`, see https://www.rdocumentation.org/packages/prophet/versions/0.3.0.1/topics/prophet – RLave Jul 31 '18 at 13:42
i had version `0.2` installed, and `seasonality.mode` wasn't there. in the newest version it is – quant Jul 31 '18 at 14:13
Does `prophet` also choose the 'best' model ? And if so under which criteria ? And also how can i see which one is chosen ? – quant Aug 01 '18 at 07:17
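It doesn't pick among several candidate models: it fits the parameters of a single decomposable model, using L-BFGS optimization. The ts is decomposed into 2 or 3 components (trend, seasonality and, optionally, holidays) that are estimated. It won't be the "best" model overall, but it is probably better than some more simplistic ones. – RLave Aug 01 '18 at 08:07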
I'd recommend reading the paper cited https://peerj.com/preprints/3190.pdf – RLave Aug 01 '18 at 08:07
is there any way to see the closed form of the model that is chosen ? – quant Aug 01 '18 at 09:11
I don't know for sure, but I doubt there is. – RLave Aug 01 '18 at 09:12
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/177166/discussion-between-quant-and-rlave). – quant Aug 01 '18 at 09:22

score 1 · Answer 2 · answered Jul 31 '18 at 10:07

You are right: you can bet at 98.4% that there is a seasonality for t = 4*k, and its value is +21108156. If the seasonality is assumed multiplicative rather than additive, you can bet at 98.5% that there is a seasonality, and its value is +18.7%.

This is how I proceed, without using a ready-made package, so that you can ask all kinds of similar questions.

First introduce a new boolean variable dt$season = (dt$time %% 4)==0, which is true (i.e. = 1) for t = 0, 4, 8, ... and false (i.e. = 0) elsewhere. Then the function x ~ a*season + b is equal to a + b for t = 0, 4, 8, ... and to b elsewhere. In other words, a is the difference between the seasonal effect and the non-seasonal effect.

The linear regression fit <- lm(units ~ season, data = dt[time < 10]), which leaves out the placeholder row at time = 10 (units = 0), gives you a = 21108156, and summary(fit) tells you the std-error on a is 6697979, so that a value as large as the observed a = 21108156 has a probability of about 0.0161 to appear if the true value were 0. So, you can reasonably bet there is a 4-cycle seasonality with about 1 - 0.0161 = 98.39% chances to be right.
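The dummy-variable regression described above can be reproduced with base R alone. The sketch below makes two assumptions that are not spelled out in the answer: the placeholder row at time = 10 (units = 0) is dropped before fitting, and a plain data.frame called obs (a name chosen here for illustration) is used instead of the data.table from the question. It also adds a forecast for time = 10 from the same fit.

```r
# Observed data from the question, without the placeholder row (time = 10, units = 0)
obs <- data.frame(
  time  = 1:9,
  units = c(89496264, 81820040, 80960072, 109164545, 96226255,
            96270421, 95694992, 117509717, 105134778)
)

# Seasonal dummy: TRUE when time is a multiple of 4
obs$season <- (obs$time %% 4) == 0

# Model: units = b + a * season
fit   <- lm(units ~ season, data = obs)
coefs <- summary(fit)$coefficients

a_hat  <- coefs["seasonTRUE", "Estimate"]    # seasonal effect a
a_se   <- coefs["seasonTRUE", "Std. Error"]  # standard error of a
a_pval <- coefs["seasonTRUE", "Pr(>|t|)"]    # two-sided p-value for a = 0

# Forecast for time = 10; 10 is not a multiple of 4, so season is FALSE
new_row     <- data.frame(time = 10, season = (10 %% 4) == 0)
forecast_10 <- unname(predict(fit, newdata = new_row))

print(c(a = a_hat, se = a_se, p = a_pval, forecast_10 = forecast_10))
```

Because this model has no trend term, the forecast for a non-seasonal period such as time = 10 is just the intercept b, i.e. the mean of the non-seasonal observations.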

If you assume the seasonality is multiplicative, use the same reasoning with the variable dt$mult = dt$units * dt$season. This time a * dt$mult + b is equal to a * dt$units + b when the seasonality applies and to b when it does not. So the seasonality brings a difference of a * dt$units, that is, it raises the level by a = 0.1877 = 18.77%, with a significance of 0.01471 = 1 - 98.5%.
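The multiplicative variant can be sketched the same way. As before, this assumes the placeholder row at time = 10 is excluded and uses a base R data.frame named obs (an illustrative name, not from the question); it simply regresses units on the mult variable as the answer describes.

```r
# Observed data from the question, without the placeholder row (time = 10, units = 0)
obs <- data.frame(
  time  = 1:9,
  units = c(89496264, 81820040, 80960072, 109164545, 96226255,
            96270421, 95694992, 117509717, 105134778)
)

# Seasonal dummy and the "mult" variable from the answer:
# mult equals units in seasonal periods and 0 otherwise
obs$season <- (obs$time %% 4) == 0
obs$mult   <- obs$units * obs$season

# Model: units = b + a * mult
fit_mult   <- lm(units ~ mult, data = obs)
coefs_mult <- summary(fit_mult)$coefficients

a_mult <- coefs_mult["mult", "Estimate"]   # slope a on mult
p_mult <- coefs_mult["mult", "Pr(>|t|)"]   # two-sided p-value for a = 0

print(c(a = a_mult, p = p_mult))
```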

That's how ready-made packages work.
