2

I was wondering if it is possible to have more than one time series identifier column in the model? Let's assume I'd like to create a forecast at a product and store level (which the documentation suggests should be possible).

If I select product as the series identifier, the only options I have left for store is either a covariate or an attribute and neither is applicable in this scenario.

Would concatenating product and store and using the individual product and store code values for that concatenated ID as attributes be a solution? It doesn't feel right, but I can't see any other option - am I missing something?

Note: I understand that this feature of Vertex AI is currently in preview and that because of that the options may be limited.

ErrHuman
  • 335
  • 1
  • 13

2 Answers2

2

There isn't an alternate way to assign 2 or more Time Series Identifiers in the Forecasting Model on Vertex AI. The "Forecasting model" is in the "Preview" Product launch stage, as you are aware, with all consequences of that fact the options are limited. Please refer to this doc for more information about the best practices for data preparation to train the forecasting model.

As a workaround, the two columns can be concatenated and assigned a Time Series Identifier on that concatenated column, as you have mentioned in the question. This way, the concatenated column carries more contextual information into the training of the model.

Vishal K
  • 1,368
  • 1
  • 7
  • 15
2

Just to follow up on Vishal's (correct) answer in case someone is looking this up in the future.

Yes, concatenating is the only option for now as there can only be one time series identifier (I would hope this changes in the future). Having said that, I've experimented with adding the individual identifiers in the data as categorical attributes and it works actually pretty well. This way I have forecast generated at a product/store level, but I can aggregate all forecasts for individual products and the results are not much off from the models trained on aggregated data (obviously that would depend on the demand classification and selected optimisation method amongst other factors).

Also, an interesting observation. When you include things like product descriptions, you can classify them either as categorical or text. I wasn't able to find in the documentation if the model would only use unigrams (which is what the column statistics in the console would suggest) or a number of n-grams but it is definitely something you would want to experiment with with your data. My dataset was actually showing a better accuracy when the categorical classification was used, which is a bit counter-intuitive as it feels like redundant information, although it's hard to tell as the documentation isn't very detailed. It is likely to be specific to my data set, so as I said make sure you experiment with yours.

ErrHuman
  • 335
  • 1
  • 13