Edited Version to provide a better understanding of the problem:
Summary of the case:
I have a dataset with several hierarchy levels (line, main group, subgroup and product), and the products are allocated to different countries.
I want to build an aggregated forecast with the top-down approach and find the best aggregation level for that forecast.
To do this, I build a tsibble and aggregate it. With aggregate_key() I can only build one parent/child relation at a time (e.g. main group/product), but I have to consider all relationships to find the best aggregation level.
So my first question: Can I implement a loop that runs through all possible levels, or do I have to run each level individually?
Second, I need to forecast each country separately. With the bottom-up approach I can simply build a grouped structure, aggregate_key((subgroup/product)*country), but this does not work for the top-down approach (as I show in my reprex below).
So my second question: Is it possible to do the grouping or do I have to forecast each country individually?
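library(dplyr)
library(tsibble)
library(fable)

# n was not defined in the snippet; 20 is assumed here because it matches the 20 dates
n <- 20

dat <- data.frame(id = 1:n,
                  date = seq.Date(as.Date("2020-1-1"), as.Date("2020-1-20"), "day"),
                  country = rep(LETTERS[1:2], n/2),
                  line = sample(1:4, n, replace = TRUE),
                  main_group = sample(1:5, n, replace = TRUE),
                  subgroup = sample(1:10, n, replace = TRUE),
                  product = sample(1:20),
                  sales = sample(281:300))

# One series per product/country combination
dat_tsibble <- dat %>%
  as_tsibble(key = c(product, country), index = date)

# Grouped structure: products nested in subgroups, crossed with country
dat_aggregated <- dat_tsibble %>%
  aggregate_key((subgroup/product) * country,
                sales = sum(sales))

# Base ETS models, reconciled with the top-down approach
fit <- dat_aggregated %>%
  model(base = ETS(sales)) %>%
  reconcile(td = top_down(base))

fc <- fit %>% forecast(h = "30 day")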
I know that this data frame is too small to produce a proper forecast, but the error message is the same:
Thanks for any help!