How to feed key-value features (aggregated data) to LSTM?

Question

I have the following time-series aggregated input for an LSTM-based model:

x(0): {y(0,0): {a(0,0), b(0,0)}, y(0,1): {a(0,1), b(0,1)}, ..., y(0,n): {a(0,n), b(0,n)}}
x(1): {y(1,0): {a(1,0), b(1,0)}, y(1,1): {a(1,1), b(1,1)}, ..., y(1,n): {a(1,n), b(1,n)}}
...
x(m): {y(m,0): {a(m,0), b(m,0)}, y(m,1): {a(m,1), b(m,1)}, ..., y(m,n): {a(m,n), b(m,n)}}

where x(m) is a timestep, and a(m,n) and b(m,n) are features aggregated by the non-temporal sequential key y(m,n), which can take values in 0...1,000.

Example:

0: {90: {4, 4.2}, 91: {6, 0.2}, 92: {1, 0.4}, 93: {12, 11.2}}
1: {103: {1, 0.2}}
2: {100: {3, 0.1}, 101: {0.4, 4}}

Here 90-93, 103, and 100-101 are the aggregation keys. Note that the number of keys differs from one timestep to the next.

How can I feed this kind of input to an LSTM?

Another approach would be to use the non-aggregated data. In that case, I'd get a proper input for an LSTM. Example:

Aggregated input:

0: {100: {3, 0.1}, 101: {0.4, 4}}

Original input:

0: 100, 1, 0.05
1: 101, 0.2, 2
2: 100, 1, 0
3: 100, 1, 0.05
4: 101, 0.2, 2

But in that case the aggregation would be lost, and the whole purpose of the aggregation is to reduce the number of steps: I get 500 timesteps instead of e.g. 40,000, which is far too many to feed to an LSTM. If you have any ideas, I'd appreciate it.
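For reference, the relation between the original and the aggregated input in the example can be sketched as follows. This assumes the aggregation is a per-key sum, which is consistent with the numbers shown (three rows for key 100 give {3, 0.1}) but is not stated explicitly. The names `rows` and `aggregate` are hypothetical.

```python
from collections import defaultdict

# Original (non-aggregated) rows from the example: (key, a, b)
rows = [
    (100, 1, 0.05),
    (101, 0.2, 2),
    (100, 1, 0),
    (100, 1, 0.05),
    (101, 0.2, 2),
]

def aggregate(rows):
    """Sum features a and b per key (assumed aggregation rule)."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for key, a, b in rows:
        totals[key][0] += a
        totals[key][1] += b
    return {key: tuple(vals) for key, vals in totals.items()}

aggregated = aggregate(rows)
print(aggregated)
```

Five original rows collapse into one timestep holding two keys, which is how the aggregation shortens the sequence.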

Does non temporal key `y(m,n)` have arithmetic meaning and bounds to it? Or is it just random? — dumbPy, May 17 '20 at 23:47
@dumbPy `y: [0...10,000]`, it's sequential in nature in the original (before aggregation) representation. A simple example: I have a temperature and a few other parameters for every minute, but I need to represent the step daily (otherwise I'll have too many steps for LSTM). I can sacrifice those few other parameters by merging them using aggregation by temperature. — Maximus, May 18 '20 at 00:01
As a result, I have intact temperature data with aggregated other params by the temperature key: `0: {8: {...}, 9: {...}, 10: {...}}`, where 8-10 is the temperature range for a particular day (tomorrow it might be a bit warmer, e.g. 9-12 degrees celsius), and `{...}` are the aggregated other params. — Maximus, May 18 '20 at 00:01
