I'm working through the book Forecasting: Principles and Practice, specifically the section on useful predictors: https://otexts.com/fpp3/useful-predictors.html.

The text mentions intervention variables, but I can't get spike or step variables to run. I've searched Stack Overflow and elsewhere online but found no examples. The code below returns a NULL model whether I use spike or step. Any help getting intervention variables to run would be appreciated.

library(tidyverse)
library(fpp3)
fit_consBest <- us_change %>%
  model(
    lm = TSLM(Consumption ~ Income + Savings + Unemployment + trend() + season()),
    step = TSLM(formula = Consumption ~ Income + step(object = lm, scope = Income + Savings + Unemployment))
  )
# All of the reporting methods below return NULL models or errors:
report(fit_consBest)
fit_consBest %>% 
  select(step)
glance(fit_consBest)
Russ Conte

1 Answer

The step() function performs stepwise regression; it does not produce a step predictor.
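
To illustrate the distinction, here is a minimal base-R sketch. It assumes the built-in mtcars data and an arbitrary set of predictors, neither of which comes from the question. It shows that step() takes a fitted model and returns another fitted model, rather than a column of 0s and 1s:

```r
# Illustration only: mtcars and these predictors are stand-ins, not from the question.
full <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# stats::step() searches over sub-models by AIC and returns a fitted model.
selected <- step(full, trace = 0)

class(selected)    # a fitted "lm" object, not a predictor vector
formula(selected)  # the formula of the model that step() settled on
```

Because the result is a model object, it cannot serve as a term on the right-hand side of a TSLM() formula.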

Here is an example that uses a step predictor. In this case, the step occurs in the first quarter of 1975: the predictor is 0 before that quarter and 1 from then on.

library(fpp3)
#> ── Attaching packages ─────────────────────────────────────── fpp3 0.4.0.9000 ──
#> ✓ tibble      3.1.6          ✓ tsibble     1.1.1     
#> ✓ dplyr       1.0.7          ✓ tsibbledata 0.3.0.9000
#> ✓ tidyr       1.1.4          ✓ feasts      0.2.2.9000
#> ✓ lubridate   1.8.0          ✓ fable       0.3.1.9000
#> ✓ ggplot2     3.3.5
#> ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
#> x lubridate::date()    masks base::date()
#> x dplyr::filter()      masks stats::filter()
#> x tsibble::intersect() masks base::intersect()
#> x tsibble::interval()  masks lubridate::interval()
#> x dplyr::lag()         masks stats::lag()
#> x tsibble::setdiff()   masks base::setdiff()
#> x tsibble::union()     masks base::union()
fit_consBest <- us_change %>%
  model(
    lm = TSLM(Consumption ~ Income + Savings + Unemployment + trend() + season()),
    step = TSLM(Consumption ~ Income + (year(Quarter) >= 1975))
  )
glance(fit_consBest)
#> # A tibble: 2 × 15
#>   .model r_squared adj_r_squared sigma2 statistic  p_value    df log_lik   AIC
#>   <chr>      <dbl>         <dbl>  <dbl>     <dbl>    <dbl> <int>   <dbl> <dbl>
#> 1 lm         0.776         0.768 0.0944      94.1 2.61e-58     8   -43.2 -457.
#> 2 step       0.148         0.139 0.350       17.0 1.61e- 7     3  -176.  -203.
#> # … with 6 more variables: AICc <dbl>, BIC <dbl>, CV <dbl>, deviance <dbl>,
#> #   df.residual <int>, rank <int>
tidy(fit_consBest)
#> # A tibble: 11 × 6
#>    .model term                      estimate std.error statistic  p.value
#>    <chr>  <chr>                        <dbl>     <dbl>     <dbl>    <dbl>
#>  1 lm     (Intercept)                0.441    0.0650       6.79  1.38e-10
#>  2 lm     Income                     0.741    0.0397      18.7   7.36e-45
#>  3 lm     Savings                   -0.0528   0.00293    -18.0   5.96e-43
#>  4 lm     Unemployment              -0.343    0.0680      -5.04  1.06e- 6
#>  5 lm     trend()                   -0.00113  0.000391    -2.89  4.34e- 3
#>  6 lm     season()year2             -0.0760   0.0617      -1.23  2.19e- 1
#>  7 lm     season()year3             -0.0478   0.0626      -0.763 4.46e- 1
#>  8 lm     season()year4             -0.0865   0.0619      -1.40  1.64e- 1
#>  9 step   (Intercept)                0.485    0.138        3.52  5.45e- 4
#> 10 step   Income                     0.273    0.0469       5.82  2.39e- 8
#> 11 step   year(Quarter) >= 1975TRUE  0.0658   0.140        0.471 6.38e- 1

Created on 2021-12-04 by the reprex package (v2.0.1)
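
A spike variable can probably be handled the same way. The sketch below is a hedged variation on the example above, not part of the original reprex. It builds both dummies as explicit 0/1 columns with mutate() before fitting. The column names step_1975 and spike_1980q1 are hypothetical, and 1980 Q1 is an arbitrary quarter chosen only for illustration.

```r
library(fpp3)

# Assumption: 1980 Q1 is an arbitrary spike date, and the column names
# step_1975 and spike_1980q1 are made up for this sketch.
us_change_iv <- us_change %>%
  mutate(
    # Step: 0 before 1975, 1 from 1975 Q1 onwards
    step_1975 = as.integer(year(Quarter) >= 1975),
    # Spike: 1 in the single quarter 1980 Q1, 0 everywhere else
    spike_1980q1 = as.integer(Quarter == yearquarter("1980 Q1"))
  )

fit_iv <- us_change_iv %>%
  model(
    iv = TSLM(Consumption ~ Income + step_1975 + spike_1980q1)
  )

# One coefficient per intervention variable appears in the output
tidy(fit_iv)
```

Creating the dummies as columns keeps the term names in tidy() short. It also means that any new_data passed to forecast() would need the same two columns filled in for the future periods.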

Rob Hyndman