I'm forecasting hierarchical data with fable. The hierarchy currently has 2 levels of aggregation (more will be added in the future), and I'm having trouble telling which forecasts correspond to which series. Here is a simplified version of what I have:

# A fable: 7 x 6 [12M]
# Key:     type, name, .model [7]
  type         name         .model     date       value  .mean
  <chr*>       <chr*>       <chr>     <mth>      <dist>  <dbl>
1 x            x1           mint   2021 Jan N(20, 0.82)  19.9 
2 x            x2           mint   2021 Jan  N(20, 1.3)  19.9 
3 x            <aggregated> mint   2021 Jan    N(40, 1)  39.8 
4 y            y1           mint   2021 Jan N(9.7, 1.9)  9.73
5 y            y2           mint   2021 Jan N(9.9, 1.7)  9.92
6 y            <aggregated> mint   2021 Jan  N(20, 3.8)  19.6 
7 <aggregated> <aggregated> mint   2021 Jan  N(59, 5.9)  59.4 

Is there a way to rename the aggregated vectors as I am pivoting and aggregating the table? So it would look something like this:

# A fable: 7 x 6 [12M]
# Key:     type, name, .model [7]
  type         name         .model     date       value  .mean
  <chr*>       <chr*>       <chr>     <mth>      <dist>  <dbl>
1 x            x1           mint   2021 Jan N(20, 0.82)  19.9 
2 x            x2           mint   2021 Jan  N(20, 1.3)  19.9 
3 x            x            mint   2021 Jan    N(40, 1)  39.8 
4 y            y1           mint   2021 Jan N(9.7, 1.9)  9.73
5 y            y2           mint   2021 Jan N(9.9, 1.7)  9.92
6 y            y            mint   2021 Jan  N(20, 3.8)  19.6 
7 xy           xy           mint   2021 Jan  N(59, 5.9)  59.4 

I can do it manually for one level of aggregation by renaming all aggregated values to what I want, but I'm not sure how to do it for two (or more) levels. I have tried using the is_aggregated() function, but with 20 series at the bottom level it becomes hard to work out which aggregate corresponds to which series.

Thanks so much!

Here's a reprex:

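library(tidyverse)  # tibble, dplyr, tidyr
library(tsibble)    # yearmonth(), as_tsibble()
library(fable)      # ARIMA(); also attaches fabletools

set.seed(123)  # fix the random noise so the reprex is reproducible

df <- tibble(
  date = seq(from = as.Date("2011/1/1"), to = as.Date("2020/1/1"), by = "year"),
  x1 = 11:20,
  x2 = x1 + rnorm(10),
  y1 = 1:10,
  y2 = y1 - rnorm(10)
)

df %>% 
  mutate(date = yearmonth(date)) %>% 
  as_tsibble(index = date) %>% 
  # one row per (date, series)
  pivot_longer(!date) %>% 
  group_by(name) %>% 
  # assign each bottom-level series to its parent group
  mutate(type = case_when(
    name %in% c("x1", "x2") ~ "x",
    name %in% c("y1", "y2") ~ "y")) %>% 
  # build the hierarchy: total -> type -> name
  aggregate_key((type / name), value = sum(value)) %>% 
  model(arima = ARIMA(value)) %>% 
  reconcile(mint = min_trace(arima, method = "mint_shrink")) %>% 
  forecast(h = 1) %>% 
  filter(.model == "mint") %>% 
  print(n = 7)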
  • I'm not sure why you're trying to replace `<aggregated>` values with their parent's value. Doing so will change the structure of the aggregation in a way that no longer sums correctly - `y` will now have three children `y1`, `y2` and `y`. The `<aggregated>` value indicates that the level of the hierarchy has been summarised over. For instance `y/y1` and `y/y2` aggregate to give `y`. This method of using `<aggregated>` as an indicator value generalises nicely to 3 or more levels of aggregation. – Mitchell O'Hara-Wild Jul 26 '22 at 09:18
  • @MitchellO'Hara-Wild The reason I want to replace them is that I have to export the output to a separate program that doesn't like the angle brackets. The aggregated values also have real-world meaning, and changing their names would make the output easier to read (e.g. xy means something specific to end users). Is there a way to replace `<aggregated>` without changing the other values of the table? Or is there a preferable solution? – Axel Torbenson Jul 26 '22 at 13:54
  • You can format the agg_vec columns and use the `agg_chr` argument to customise the text used to represent `<aggregated>` values. For example, `format(name, agg_chr = NA_character_)` to use NA instead (this requires fabletools dev version). Your naming convention of `xy` (and others) is not reversible, as it is unclear which character positions belong to each level of aggregation. You can produce similar strings with the `tidyr::unite()` function across your key columns (especially if `NA_character_` is your aggregation string). – Mitchell O'Hara-Wild Jul 27 '22 at 01:53
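
One way to get export-friendly names without touching the key columns is to add a separate label column, derived with `is_aggregated()`. This is only a minimal sketch and not a method confirmed in the thread. It assumes a two-level `type / name` hierarchy like the one in the question. The `label` column and the top-level name `"xy"` are hypothetical choices taken from the desired output above. The toy data stands in for the real series, and the same `mutate()` step should apply to the fable after `as_tibble()`. Because `type` and `name` stay as they are, the aggregation structure described in the comments is preserved.

```r
library(dplyr)
library(tsibble)
library(fabletools)

# Toy two-level hierarchy with deterministic values (stand-in for the real data)
long <- tibble(
  date  = rep(2011:2012, times = 4),
  type  = rep(c("x", "x", "y", "y"), each = 2),
  name  = rep(c("x1", "x2", "y1", "y2"), each = 2),
  value = 1:8
) %>%
  as_tsibble(index = date, key = c(type, name))

# Same aggregation structure as in the question: total -> type -> name
agg <- long %>%
  aggregate_key(type / name, value = sum(value))

# Add a plain character label; the key columns are left unchanged
labelled <- agg %>%
  as_tibble() %>%
  mutate(
    label = case_when(
      is_aggregated(type) ~ "xy",                   # grand total (hypothetical name)
      is_aggregated(name) ~ trimws(format(type)),   # group total, e.g. "x"
      TRUE                ~ trimws(format(name))    # bottom-level series, e.g. "x1"
    )
  )
```

With more levels, the same pattern extends by adding one `case_when()` branch per level, checked from the top of the hierarchy down.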
