I have three months of time series data at 15-minute intervals (one day has 96 time slots), with a temperature column [Temp] and a solar irradiance (sun intensity) column [SI]. My model has to predict temperature on a day-ahead basis for the entire day, i.e. I have to predict 96 time slots given data up to the previous day. When I evaluate the model myself, how should I split the data into train and test sets? Do I do an 80:20 split? Then my test data will contain more than one day's readings. Or do I train on (3 months - 1 day) and test only on the last day?
1 Answer
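Actually, that depends on your task. But it is strongly recommended not to mix old and new data in the train set: keep the split chronological, so that every observation used for training comes before the observations used for testing.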
There are several links that you may find useful:
http://francescopochetti.com/pythonic-cross-validation-time-series-pandas-scikit-learn/
https://stats.stackexchange.com/questions/117350/how-to-split-dataset-for-time-series-prediction
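
As a rough illustration of the chronological split described above, here is a minimal sketch of a walk-forward, day-ahead evaluation. It is one possible way to combine the two options from the question, not the only one: the last 20% or so of the days are held out, but each held-out day is predicted separately using only the slots before it. The function name `day_ahead_splits`, its parameters, and the 90-day / 18-day sizes are assumptions made for the example; only the 96 slots per day comes from the question.

```python
SLOTS_PER_DAY = 96  # 15-minute intervals, as stated in the question


def day_ahead_splits(n_samples, n_test_days, slots_per_day=SLOTS_PER_DAY):
    """Yield (train_indices, test_indices) for walk-forward, day-ahead evaluation.

    Each fold tests on one whole day and trains on every slot before that day,
    so the training data never contains anything newer than the test data.
    """
    n_days = n_samples // slots_per_day  # any trailing partial day is ignored
    if not 0 < n_test_days < n_days:
        raise ValueError("n_test_days must be between 1 and n_days - 1")
    for day in range(n_days - n_test_days, n_days):
        test_start = day * slots_per_day
        train_idx = range(0, test_start)
        test_idx = range(test_start, test_start + slots_per_day)
        yield train_idx, test_idx


# Hypothetical sizes: 90 days of data, evaluated on the last 18 days (20%).
folds = list(day_ahead_splits(n_samples=90 * SLOTS_PER_DAY, n_test_days=18))
first_train, first_test = folds[0]
last_train, last_test = folds[-1]
```

For each fold you would refit (or update) the model on the training indices, predict the 96 test slots, and average the error over all folds. If you use scikit-learn, `TimeSeriesSplit` with `test_size=96` gives similar expanding-window folds.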

avchauzov