I am trying to fit some time series using the R packages tsibble and fable, the still-under-construction replacement for the redoubtable Rob Hyndman's forecast package. The series are all combined into one tsibble, which I then fit with ARIMA, a function which replaces, among other things, forecast::auto.arima.

I use map_at twice: first to iterate over all the elements except the Date, and then again to extract the model information from the models that have been fit to each series, using fablelite::components. (A lot of the fable functions are really in fablelite.)

This fails, apparently because components expects an object of class mdl_df and my model objects have class mdl_defn.

Here is a toy example that (almost) reproduces the error:

library(tidyverse)
library(lubridate)  # for ymd()
library(tsibble)
library(fable)
set.seed(1)
ar1 <- arima.sim(model = list(ar = 0.6), n = 10)
ma1 <- arima.sim(model = list(ma = 0.4), n = 10)
Date <- ymd("2019-01-01") + 0:9  # ten consecutive dates, one per observation
tb <- tibble(Date, ar1, ma1)

# Fit the whole series
tb_all <- tb %>%
  map_at(.at = c("ar1", "ma1"), .f = ARIMA)
names(tb_all)[2:3] <- c("ar1", "ma1")

# Extract model components
tb_components <- tb %>%  
  map_at(.at = c("ar1", "ma1"), 
         .f = fablelite::components)

Note that in this toy, like my real data, time is in 5-day weeks with missing weekends.

In this toy example, the error message says the components function rejects the list elements on the grounds that there is no method for class ts. In my real case, which uses longer series and more of them but is to my eye otherwise identical, elements are rejected because they are of class mdl_defn. Note that if I examine the second and third elements of tb_all with str(), they also display as Classes 'mdl_defn', 'R6'. I am not sure where the ts in the error message comes from.

Steffen Moritz
andrewH

1 Answer

Here is an example that hopefully does something like what you want.

First, you need to create a tsibble:

library(tidyverse)
library(tsibble)
library(fable)
library(lubridate)
set.seed(1)
ar1  <-  arima.sim(model=list(ar=.6), n=30)
ma1 <- arima.sim(model=list(ma=0.4), n=30)
Date  <- ymd(paste0("2019-01-",1:30))
tb <- bind_cols(Date=Date, ar1=ar1, ma1=ma1) %>%
  gather("Series", "value", -Date) %>%
  as_tsibble(index=Date, key=Series)
tb
#> # A tsibble: 60 x 3 [1D]
#> # Key:       Series [2]
#>    Date       Series   value
#>    <date>     <chr>    <dbl>
#>  1 2019-01-01 ar1    -2.07  
#>  2 2019-01-02 ar1    -0.118 
#>  3 2019-01-03 ar1    -0.116 
#>  4 2019-01-04 ar1    -0.0856
#>  5 2019-01-05 ar1     0.892 
#>  6 2019-01-06 ar1     1.36  
#>  7 2019-01-07 ar1     1.41  
#>  8 2019-01-08 ar1     1.76  
#>  9 2019-01-09 ar1     1.84  
#> 10 2019-01-10 ar1     1.18  
#> # … with 50 more rows

This contains two series: ar1 and ma1 over the same 30 days.
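
As a side note, tidyr has since superseded gather() with pivot_longer(). The reshaping step above could be sketched as follows; the small `wide` data frame and its values are made up purely for illustration, and the row order differs from gather() (rows are interleaved by date rather than stacked by series), which does not matter once the result is a keyed tsibble.

```r
library(tidyr)
library(tsibble)

# Hypothetical wide data: one column per series (values are illustrative only)
wide <- data.frame(
  Date = as.Date("2019-01-01") + 0:4,
  ar1  = c(0.1, 0.4, -0.2, 0.3, 0.0),
  ma1  = c(1.2, 0.8, 1.1, 0.9, 1.0)
)

# Long format: one row per (Date, Series) pair
long <- pivot_longer(wide, cols = -Date,
                     names_to = "Series", values_to = "value")

# Declare the time index and the key that identifies each series
tb_long <- as_tsibble(long, index = Date, key = Series)
```

The key column is what lets model() fit one model per series later on.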

Next, you can fit ARIMA models to both series with a single call to model().

tb_all <- tb %>% model(arima = ARIMA(value))
tb_all
#> # A mable: 2 x 2
#> # Key:     Series [2]
#>   Series arima                 
#>   <chr>  <model>               
#> 1 ar1    <ARIMA(0,0,2)>        
#> 2 ma1    <ARIMA(0,0,0) w/ mean>
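
The question mentions 5-day weeks with missing weekends. In a daily tsibble those weekends appear as implicit gaps, which tsibble can detect with has_gaps() and turn into explicit NA rows with fill_gaps(). Here is a minimal sketch, assuming a daily index; the data are made up, and whether the resulting missing values are acceptable will depend on the model you fit.

```r
library(tsibble)

# Hypothetical weekday-only data: Monday 2019-01-07 to Friday 2019-01-18
all_days <- as.Date("2019-01-07") + 0:11
weekdays_only <- all_days[!format(all_days, "%u") %in% c("6", "7")]

ts_wd <- tsibble(
  Date  = weekdays_only,
  value = seq_along(weekdays_only),
  index = Date
)

# TRUE in the .gaps column: the skipped weekend is an implicit gap
has_gaps(ts_wd)

# Insert the missing days as explicit rows with value = NA
ts_full <- fill_gaps(ts_wd)
```

An alternative, not shown here, would be to replace the calendar date with a trading-day counter as the index, so the series has no gaps at all.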

Finally, it is not clear what you are trying to extract using components(), but perhaps one of the following does what you want:

tidy(tb_all)
#> # A tibble: 3 x 7
#>   Series .model term     estimate std.error statistic  p.value
#>   <chr>  <chr>  <chr>       <dbl>     <dbl>     <dbl>    <dbl>
#> 1 ar1    arima  ma1         0.810     0.198      4.09 0.000332
#> 2 ar1    arima  ma2         0.340     0.181      1.88 0.0705  
#> 3 ma1    arima  constant    0.295     0.183      1.61 0.118
glance(tb_all)
#> # A tibble: 2 x 9
#>   Series .model sigma2 log_lik   AIC  AICc   BIC ar_roots  ma_roots 
#>   <chr>  <chr>   <dbl>   <dbl> <dbl> <dbl> <dbl> <list>    <list>   
#> 1 ar1    arima   0.695   -36.4  78.9  79.8  83.1 <cpl [0]> <cpl [2]>
#> 2 ma1    arima   1.04    -42.7  89.4  89.8  92.2 <cpl [0]> <cpl [0]>
augment(tb_all)
#> # A tsibble: 60 x 6 [1D]
#> # Key:       Series, .model [2]
#>    Series .model Date         value .fitted  .resid
#>    <chr>  <chr>  <date>       <dbl>   <dbl>   <dbl>
#>  1 ar1    arima  2019-01-01 -2.07    -0.515 -1.56  
#>  2 ar1    arima  2019-01-02 -0.118   -1.21   1.09  
#>  3 ar1    arima  2019-01-03 -0.116    0.511 -0.627 
#>  4 ar1    arima  2019-01-04 -0.0856  -0.155  0.0690
#>  5 ar1    arima  2019-01-05  0.892   -0.154  1.05  
#>  6 ar1    arima  2019-01-06  1.36     0.871  0.486 
#>  7 ar1    arima  2019-01-07  1.41     0.749  0.659 
#>  8 ar1    arima  2019-01-08  1.76     0.699  1.06  
#>  9 ar1    arima  2019-01-09  1.84     1.09   0.754 
#> 10 ar1    arima  2019-01-10  1.18     0.973  0.206 
#> # … with 50 more rows
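
Note that tidy() returns one row per estimated coefficient, not one row per series. In the output above, the model chosen for ar1 has two MA terms and the model chosen for ma1 has only a constant, which is why there are three rows in total. A rough sketch of tabulating this per series is below; it rebuilds the data itself so it can run on its own, and the object names `fits`, `coefs`, and `n_terms` are just illustrative. The models selected may vary with package versions.

```r
library(dplyr)
library(tidyr)
library(tsibble)
library(fable)

set.seed(1)
tb <- tibble(
  Date = as.Date("2019-01-01") + 0:29,
  ar1  = as.numeric(arima.sim(model = list(ar = 0.6), n = 30)),
  ma1  = as.numeric(arima.sim(model = list(ma = 0.4), n = 30))
) %>%
  pivot_longer(-Date, names_to = "Series", values_to = "value") %>%
  as_tsibble(index = Date, key = Series)

# One automatically selected ARIMA model per series
fits <- tb %>% model(arima = ARIMA(value))

# One row per coefficient, labelled by the series it belongs to
coefs <- tidy(fits) %>% select(Series, term, estimate)

# Number of estimated coefficients for each series
# (a series whose model has no coefficients would not appear here)
n_terms <- count(coefs, Series)
```

The fitted orders themselves are shown in the model column when the mable is printed.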

To see model outputs in the traditional way, use report():

tb_all %>% filter(Series=='ar1') %>% report()
#> Series: value 
#> Model: ARIMA(0,0,2) 
#> 
#> Coefficients:
#>          ma1     ma2
#>       0.8102  0.3402
#> s.e.  0.1982  0.1809
#> 
#> sigma^2 estimated as 0.6952:  log likelihood=-36.43
#> AIC=78.86   AICc=79.78   BIC=83.06
tb_all %>% filter(Series=='ma1') %>% report()
#> Series: value 
#> Model: ARIMA(0,0,0) w/ mean 
#> 
#> Coefficients:
#>       constant
#>         0.2950
#> s.e.    0.1833
#> 
#> sigma^2 estimated as 1.042:  log likelihood=-42.68
#> AIC=89.36   AICc=89.81   BIC=92.17
Rob Hyndman
  • Spot on for how I was screwing up my tsibble. Of course the long format is the tidy way! I suspect that you have answered this question, but I am still not entirely sure which of these outputs tells me what I want to know, which is the order of the fitted model plus the coefficients for each equation. That seems closest to the tb_all output plus the "tidy" results. But under tidy I am expecting two sets of coefficients and seeing three. Is it that the orders tell me that I should expect two coefficients for the ar model and only 1 for the ma model? – andrewH Jul 26 '19 at 05:07
  • Maybe `report()` is what you want. I've updated my answer to include that. – Rob Hyndman Jul 26 '19 at 07:37