Train-test split in panel data

Asked Apr 14 '23 at 16:21

Active Apr 17 '23 at 20:27

I have a panel dataset (multiple time series indexed by ID and time) on which I want to produce multi-step forecasts (e.g. 5-step forecasts). An example of the dataset (a pandas DataFrame) is the following:

IDs  time  f1   f2   ...
1    0     4.1  50   ...
1    1     3.3  44   ...
1    2     2.6  11   ...
2    0     2.1  79   ...
2    1     4.9  56   ...
2    2     0.1  11   ...
...  ...   ...  ...  ...

However, I don't know how to train my models and, specifically, how I should split my data into train and test sets. The end goal is to produce 5-step forecasts, where the trained model takes windows of length x as input.
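To make the input/output shapes concrete, here is a minimal sketch of how windows of length x and their 5-step targets could be built per ID. It is an illustration only: the helper name `make_windows`, the toy values, and the choice of `window=3` are assumptions, and the column names follow the example table above.

```python
import numpy as np
import pandas as pd


def make_windows(df, features, window=3, horizon=5, id_col="IDs", time_col="time"):
    """Build (input window, forecast target) pairs separately for each ID.

    Hypothetical helper, not part of pandas or sktime.
    """
    X, y = [], []
    for _, group in df.sort_values([id_col, time_col]).groupby(id_col):
        values = group[features].to_numpy()
        # Slide over one series at a time so that no window mixes two IDs.
        for start in range(len(values) - window - horizon + 1):
            X.append(values[start : start + window])
            y.append(values[start + window : start + window + horizon])
    n_feat = len(features)
    return (
        np.array(X).reshape(-1, window, n_feat),
        np.array(y).reshape(-1, horizon, n_feat),
    )


# Toy panel with made-up values: 2 IDs, 10 time steps each.
toy = pd.DataFrame({
    "IDs": np.repeat([1, 2], 10),
    "time": np.tile(np.arange(10), 2),
    "f1": np.arange(20, dtype=float),
    "f2": np.arange(20, dtype=float) * 2,
})
X, y = make_windows(toy, features=["f1", "f2"], window=3, horizon=5)
print(X.shape, y.shape)
```

Each ID with 10 time steps yields 10 - 3 - 5 + 1 = 3 window/target pairs here, so a series shorter than `window + horizon` contributes none.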

I was thinking of splitting the data as follows: 80% of the IDs would go in the train set and 20% in the test set, and I would then use a sliding window for cross-validation (e.g. using sktime's SlidingWindowSplitter).
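As a rough sketch of that idea, the 80/20 split over IDs and the sliding-window folds could look like the following. This is hand-rolled pandas/NumPy code rather than sktime's own splitter; `split_by_id`, `sliding_folds`, the seed, and the toy panel are all hypothetical names and values chosen for illustration.

```python
import numpy as np
import pandas as pd


def split_by_id(df, test_frac=0.2, id_col="IDs", seed=0):
    """Hold out a random fraction of IDs; all rows of an ID stay on one side."""
    ids = df[id_col].unique()
    rng = np.random.default_rng(seed)
    n_test = max(1, int(round(test_frac * len(ids))))
    test_ids = rng.choice(ids, size=n_test, replace=False)
    is_test = df[id_col].isin(test_ids)
    return df[~is_test], df[is_test]


def sliding_folds(n_times, window, horizon, step=1):
    """Yield (input positions, validation positions) for a sliding window."""
    for start in range(0, n_times - window - horizon + 1, step):
        input_idx = np.arange(start, start + window)
        val_idx = np.arange(start + window, start + window + horizon)
        yield input_idx, val_idx


# Toy panel with made-up values: 10 IDs, 12 time steps each.
panel = pd.DataFrame({
    "IDs": np.repeat(np.arange(1, 11), 12),
    "time": np.tile(np.arange(12), 10),
    "f1": np.random.default_rng(1).normal(size=120),
})

train, test = split_by_id(panel, test_frac=0.2)
folds = list(sliding_folds(n_times=12, window=5, horizon=5))
print(train["IDs"].nunique(), test["IDs"].nunique(), len(folds))
```

Because whole IDs are held out, the test score in this setup reflects how well the model transfers to series it has never seen during training.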

Therefore, what would be a good training strategy to follow?

edited Apr 17 '23 at 20:27

asked Apr 14 '23 at 16:21

Mike7

Can you mention what type of data you have, e.g. whether it is a CSV? Also, can you provide a sample dataset on which the split can be performed? That would help you get a better answer to your question. – Lav Sharma Apr 14 '23 at 16:53
The data are read from a CSV file into a pandas DataFrame. The dataset is similar to the one provided above. – Mike7 Apr 14 '23 at 19:10
