
I am trying to use forecast reconciliation in fable to improve forecasts at low, intermittent hierarchy levels. However, my computer runs out of memory for anything but trivial examples.

I am basing my analysis on example code from the presentation "Tidy Time Series & Forecasting in R: 10. Forecast Reconciliation" (bit.ly/fable2020, presented at rstudio::conf 2020):

tourism %>%
  aggregate_key(Purpose * (State / Region), Trips = sum(Trips)) %>%
  model(ets = ETS(Trips)) %>%
  reconcile(ets_adjusted = min_trace(ets)) %>%
  forecast(h = 2)

This runs fine, even on my 8 GB RAM laptop.

However, our data has many more hierarchy levels and groupings than this example, and the code never completes. As a reproducible example, I have added three more dummy levels to the tsibble::tourism dataset and included them in the aggregate_key(). This runs out of memory even on my 50 GB RAM server!

tourism %>%
  mutate(Region1 = Region, Region2 = Region, Region3 = Region) %>%
  aggregate_key(Purpose * (State / Region / Region1 / Region2 / Region3), Trips = sum(Trips)) %>%
  model(ets = ETS(Trips)) %>%
  reconcile(ets_adjusted = min_trace(ets)) %>%
  forecast(h = 2)

Error: cannot allocate vector of size 929 Kb

Question: Is there some way I can run this without reducing the number of hierarchy levels and without running out of memory? Thanks!

Daniel B
  • Lodged as a github issue: https://github.com/tidyverts/fabletools/issues/160 – Rob Hyndman Feb 24 '20 at 22:47
  • Thanks for bringing this up. As fable accepts arbitrary key aggregation structures, the algorithm to create the summation matrix is not optimal. I've begun working on improving performance, and you can follow the progress here: https://github.com/tidyverts/fabletools/issues/160 – Mitchell O'Hara-Wild Feb 26 '20 at 03:15

1 Answer


Thanks for your interest in fable. In the current CRAN version of fabletools (0.1.2), reconciliation is experimental, and as part of this we have prioritised interface design and experimentation over performance.

As part of this experimentation, we are trying to find flexible ways to identify the aggregation structure and build an appropriate summation matrix. As your example shows, the current approach is not well suited to deeply nested series.

I've written an alternate algorithm which I think performs better in these circumstances, in both time and space complexity. This should allow you to compute hierarchical forecasts without excessive memory use.

Update: This change is now published in version 0.1.3 of fabletools.
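As a sketch of what resolving this should look like (the 0.1.3 version number is from the update above; the pipeline is the one from the question, and everything else is standard R tooling):

```r
# Update fabletools from CRAN to pick up the improved
# summation-matrix algorithm (requires version 0.1.3 or later).
install.packages("fabletools")
packageVersion("fabletools")  # confirm >= 0.1.3 before re-running

library(fable)
library(tsibble)
library(dplyr)

# Re-run the original reconciliation pipeline unchanged;
# only the package version needed to change.
tourism %>%
  aggregate_key(Purpose * (State / Region), Trips = sum(Trips)) %>%
  model(ets = ETS(Trips)) %>%
  reconcile(ets_adjusted = min_trace(ets)) %>%
  forecast(h = 2)
```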

  • Thanks for this update. I have successfully tested a reconciled forecast which previously failed, with 4550 items organized into a hierarchy with three levels. This completed in three hours with no RAM issues. I then started a complete test with the same data but with "warehouse" as a grouping level, which increases the number of leaf nodes to 12800, and a product hierarchy with five levels. That test has now been running for four days(!), and RStudio has allocated 42 GB of RAM (of 50 available). Although performance is not great, memory usage has clearly improved, so I consider this issue solved. – Daniel B Mar 05 '20 at 13:24
  • Thanks for your testing. We will be doing our own testing of a similar scale in the coming months, so you can expect more performance improvements for reconciliation to come. – Mitchell O'Hara-Wild Mar 05 '20 at 14:43
  • Update: The complete test I mentioned above has now completed successfully after 6 days of processing. – Daniel B Mar 09 '20 at 09:26