I'm trying to use an LSTM to forecast store sales. Here is what my raw data looks like:
| Date       | StoreID | Sales | Temperature | Open | StoreType |
|------------|---------|-------|-------------|------|-----------|
| 01/01/2016 | 1       | 0     | 36          | 0    | 1         |
| 01/02/2016 | 1       | 10100 | 42          | 1    | 1         |
| ...        | ...     | ...   | ...         | ...  | ...       |
| 12/31/2016 | 1       | 14300 | 39          | 1    | 1         |
| 01/01/2016 | 2       | 25000 | 46          | 1    | 3         |
| 01/02/2016 | 2       | 23700 | 43          | 1    | 3         |
| ...        | ...     | ...   | ...         | ...  | ...       |
| 12/31/2016 | 2       | 20600 | 37          | 1    | 3         |
| ...        | ...     | ...   | ...         | ...  | ...       |
| 12/31/2016 | 10      | 19800 | 52          | 1    | 2         |
I need to forecast sales for the next 10 days. In this example, that means forecasting the store sales from 01/01/2017 to 01/10/2017. I know how to solve this problem with other time series or regression models, but I want to know whether an RNN-LSTM is a good candidate for it.
I started by taking only the StoreID=1 data to test the LSTM. If my data had only Date and Sales, I would construct trainX and trainY in this way (please correct me if I'm wrong):
Window = 20
Horizon = 10
| trainX | trainY |
|--------|--------|
| [Yt-10, Yt-11, Yt-12,...,Yt-29] | [Yt, Yt-1, Yt-2,...,Yt-9] |
| [Yt-11, Yt-12, Yt-13,...,Yt-30] | [Yt-1, Yt-2, Yt-3,...,Yt-10] |
| [Yt-12, Yt-13, Yt-14,...,Yt-31] | [Yt-2, Yt-3, Yt-4,...,Yt-11] |
| ... | ... |
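The window construction above can be sketched roughly as follows. This is only a minimal illustration under assumptions: the helper name `make_windows` and the synthetic `sales` series are placeholders, not part of the real data, and each row is stored in chronological order (oldest value first) rather than the most-recent-first order written in the table. The series length of 329 is chosen only so that the sample count comes out to 300.

```python
import numpy as np

def make_windows(series, window=20, horizon=10):
    """Slice a 1-D series into (input window, forecast target) pairs."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - window - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for this window and horizon")
    # Row i of X holds `window` consecutive values starting at index i.
    X = np.stack([series[i:i + window] for i in range(n_samples)])
    # Row i of Y holds the `horizon` values that immediately follow row i of X.
    Y = np.stack([series[i + window:i + window + horizon]
                  for i in range(n_samples)])
    return X, Y

# Synthetic stand-in for one store's daily sales (assumption, not real data).
sales = np.arange(329, dtype=float)
trainX, trainY = make_windows(sales)
print(trainX.shape, trainY.shape)  # (300, 20) (300, 10)
```

At this point `trainX` is still two-dimensional; it has to be reshaped to three dimensions before it can be fed to an LSTM layer.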
After reshaping the two arrays, their shapes are:
trainX.shape
(300, 1, 20)
trainY.shape
(300, 10)
Question1: In this case, [samples, time steps, features] = [300, 1, 20]. Is this right, or should I shape the samples as [300, 20, 1]?
Question2: I also want to use other information in the raw data, like Temperature, StoreType, etc. How should I construct the input data for the LSTM?
Question3: So far I have only discussed forecasting for one store. If I want to forecast for all the stores, how should I construct the input data?
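To make the two candidate layouts in Question1 concrete, here is a small NumPy sketch. The `windows` array is a placeholder filled with dummy values, and the sketch only shows how the same numbers end up arranged in each layout; it does not settle which one is appropriate.

```python
import numpy as np

# Placeholder window matrix: 300 samples, 20 lagged values each (dummy data).
windows = np.arange(300 * 20, dtype=float).reshape(300, 20)

# Layout A: one time step per sample, with the 20 lags treated as 20 features.
one_step = windows.reshape(300, 1, 20)

# Layout B: 20 time steps per sample, with one feature (Sales) per step.
many_steps = windows.reshape(300, 20, 1)

print(one_step.shape, many_steps.shape)  # (300, 1, 20) (300, 20, 1)
```

Both arrays contain identical values; the difference is whether the recurrent layer sees a single step holding 20 numbers or a sequence of 20 steps holding one number each.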
Currently I'm following examples from here, but they do not seem sufficient to cover my scenario. I really appreciate your help!