I have the following time-series aggregated input for an LSTM-based model:
x(0): {y(0,0): {a(0,0), b(0,0)}, y(0,1): {a(0,1), b(0,1)}, ..., y(0,n): {a(0,n), b(0,n)}}
x(1): {y(1,0): {a(1,0), b(1,0)}, y(1,1): {a(1,1), b(1,1)}, ..., y(1,n): {a(1,n), b(1,n)}}
...
x(m): {y(m,0): {a(m,0), b(m,0)}, y(m,1): {a(m,1), b(m,1)}, ..., y(m,n): {a(m,n), b(m,n)}}
where x(m)
is a timestep, a(m,n)
and b(m,n)
are features aggregated by the non-temporal sequential key y(m,n)
which might be 0...1,000
.
Example:
0: {90: {4, 4.2}, 91: {6, 0.2}, 92: {1, 0.4}, 93: {12, 11.2}}
1: {103: {1, 0.2}}
2: {100: {3, 0.1}, 101: {0.4, 4}}
Where 90-93, 103, and 100-101 are aggregation keys.
How can I feed this kind of input to LSTM?
Another approach would be to use non-aggregated data. In that case, I'd get the proper input for LSTM. Example:
Aggregated input:
0: {100: {3, 0.1}, 101: {0.4, 4}}
Original input:
0: 100, 1, 0.05
1: 101, 0.2, 2
2: 100, 1, 0
3: 100, 1, 0.05
4: 101, 0.2, 2
But in that case, the aggregation would be lost, and the whole purpose of aggregation is to minimize the number of steps so that I get 500 timesteps instead of e.g. 40,000, which is impossible to feed to LSTM. If you have any ideas I'd appreciate it.