I may not have understood your problem fully, but as far as I can see in the picture, you have more than one record per day, and that is your main problem. Solving this is not very hard. Note that you can pass tensors as your X and bind all of the records for one day into a single matrix.
For example, a simple tensor for your data would look like this:
date : [Entity vector, Weight vector, …, Cost vector]
----------------------------------------------------------------------------
24/05/2019: [[A, B, C, D], [18.1, 22, 36, 46], …, [25, 24, 23, 50]]
25/05/2019: [[A, B, C], [43, 44, 35], …, [24, 0, 0]]
27/05/2019: [[A, B, C, D, F], [34, 46, 31, 27, 60], …, [27, 24, 23, 50, 35]]
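As a rough sketch of how the per-day grouping above could be built, assuming your raw data is a flat list of `(date, entity, weight, cost)` rows: the row layout, the `records` sample and the `group_by_day` helper are my own illustration, and only the weight and cost columns are shown (the columns hidden behind "…" would be handled the same way).

```python
from collections import defaultdict

# Hypothetical flat records: (date, entity, weight, cost).
# The column layout is an assumption; values are taken from the table above.
records = [
    ("24/05/2019", "A", 18.1, 25),
    ("24/05/2019", "B", 22, 24),
    ("24/05/2019", "C", 36, 23),
    ("24/05/2019", "D", 46, 50),
    ("25/05/2019", "A", 43, 24),
    ("25/05/2019", "B", 44, 0),
    ("25/05/2019", "C", 35, 0),
]

def group_by_day(rows):
    """Collect all records of one date into [entities, weights, costs]."""
    days = defaultdict(lambda: [[], [], []])
    for date, entity, weight, cost in rows:
        entities, weights, costs = days[date]
        entities.append(entity)
        weights.append(weight)
        costs.append(cost)
    return dict(days)

daily = group_by_day(records)
print(daily["25/05/2019"])  # [['A', 'B', 'C'], [43, 44, 35], [24, 0, 0]]
```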
NOTE: it may be necessary for all of your vectors to have the same length (for matrix multiplication). For that you can use "padding", which just means putting 0 or -1 in place of missing entities. There are two possible scenarios for you:
1) If you have a finite set of entities, like just A to F, you simply add -1 for values that are not present. There is no need for the first vector, because the positions are fixed; for example, index 1 always represents A. The final tensors will look like this after padding:
date : "indexes are [A, B, C, D, E, F]" [Weight vector, …, Cost vector]
-----------------------------------------------------------------------
24/05/2019: [[18.1, 22, 36, 46, -1, -1], …, [25, 24, 23, 50, -1, -1]]
25/05/2019: [[43, 44, 35, -1, -1, -1], …, [24, 0, 0, -1, -1, -1]]
27/05/2019: [[34, 46, 31, 27, -1, 60], …, [27, 24, 23, 50, -1, 35]]
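A minimal sketch of this fixed-index padding, assuming the entity set really is just A to F. The names `ENTITIES`, `PAD` and `pad_fixed` are hypothetical, and the second call uses a made-up day just to show that each value lands in the slot of its own entity.

```python
ENTITIES = ["A", "B", "C", "D", "E", "F"]  # assumed fixed, known set of entities
PAD = -1                                   # marker for "entity not present"

def pad_fixed(entities, values, vocabulary=ENTITIES, pad=PAD):
    """Place each value at the slot of its entity; absent entities get `pad`."""
    by_entity = dict(zip(entities, values))
    return [by_entity.get(name, pad) for name in vocabulary]

# Weight vector of 24/05/2019 from the table above.
print(pad_fixed(["A", "B", "C", "D"], [18.1, 22, 36, 46]))
# [18.1, 22, 36, 46, -1, -1]

# Hypothetical day where only B and F are present.
print(pad_fixed(["B", "F"], [10, 20]))
# [-1, 10, -1, -1, -1, 20]
```

Because the position encodes the entity, the entity vector itself can be dropped, as described above.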
2) If you have an unbounded set of entities, meaning your entities could be anything, then you have to keep the first vector and pad all vectors to the length of the longest one. The final tensors in this case will look like this after padding (supposing 27/05/2019 has the maximum length):
date : [Entity vector, Weight vector, …, Cost vector]
--------------------------------------------------------------------------------
24/05/2019: [[A, B, C, D, -1], [18.1, 22, 36, 46, -1], …, [25, 24, 23, 50, -1]]
25/05/2019: [[A, B, C, -1, -1], [43, 44, 35, -1, -1], …, [24, 0, 0, -1, -1]]
27/05/2019: [[A, B, C, D, F], [34, 46, 31, 27, 60], …, [27, 24, 23, 50, 35]]
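For this second scenario, padding to the longest vector could look roughly like the sketch below. The `pad_to_longest` helper and the `days` dictionary layout are assumptions of mine; the values are the ones from the tables above, with the "…" columns left out.

```python
PAD = -1  # marker for missing entries

def pad_to_longest(days, pad=PAD):
    """Right-pad every vector of every day to the length of the longest vector."""
    longest = max(len(vec) for vectors in days.values() for vec in vectors)
    return {
        date: [vec + [pad] * (longest - len(vec)) for vec in vectors]
        for date, vectors in days.items()
    }

# Each day holds [entity vector, weight vector, cost vector].
days = {
    "24/05/2019": [["A", "B", "C", "D"], [18.1, 22, 36, 46], [25, 24, 23, 50]],
    "25/05/2019": [["A", "B", "C"], [43, 44, 35], [24, 0, 0]],
    "27/05/2019": [["A", "B", "C", "D", "F"], [34, 46, 31, 27, 60], [27, 24, 23, 50, 35]],
}

padded = pad_to_longest(days)
print(padded["25/05/2019"][0])  # ['A', 'B', 'C', -1, -1]
print(padded["24/05/2019"][1])  # [18.1, 22, 36, 46, -1]
```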
TIP: if your entities are more than one word, you can use a hash to map each of them to a single number. (I don't recommend using a series of word embeddings for this! That is too heavy for an LSTM model trained on 6 months of data, and you won't get a good result out of it.)
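One possible way to do the hashing, sketched with the standard `hashlib` module. The `entity_to_id` helper, the bucket count of 10,000 and the sample names are assumptions for illustration. I use `hashlib` rather than the built-in `hash()` because Python randomizes string hashes per process by default, so the ids would otherwise change between runs.

```python
import hashlib

def entity_to_id(name, buckets=10_000):
    """Map an entity name (possibly several words) to a stable integer id."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets

# Hypothetical multi-word entity names.
entities = ["Entity A", "Some Other Entity", "Entity A"]
ids = [entity_to_id(name) for name in entities]
print(ids[0] == ids[2])  # True: the same name always gets the same id
```

Keep in mind that two different names can collide in the same bucket; a larger `buckets` value makes that less likely.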
Now you feed these vectors into your LSTM. In the picture below, X0, X1, … are these tensors (and you may expect the next-day price from the h outputs).
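To make the feeding step a bit more concrete, here is a framework-free sketch that shapes the padded days into (samples, timesteps, features) windows, which is the input layout LSTM layers such as the one in Keras expect. The helpers `flatten_day` and `make_windows`, the window length and the `prices` series are all hypothetical; the numeric vectors are the padded weight and cost vectors from scenario 2.

```python
def flatten_day(vectors):
    """Concatenate one day's padded numeric vectors into a single feature vector."""
    return [value for vec in vectors for value in vec]

def make_windows(daily_features, prices, timesteps):
    """Build (samples, timesteps, features) inputs with next-day price targets."""
    inputs, targets = [], []
    for start in range(len(daily_features) - timesteps):
        inputs.append(daily_features[start:start + timesteps])
        targets.append(prices[start + timesteps])
    return inputs, targets

# Padded [weight vector, cost vector] per day, in date order.
days = [
    [[18.1, 22, 36, 46, -1], [25, 24, 23, 50, -1]],   # 24/05/2019
    [[43, 44, 35, -1, -1], [24, 0, 0, -1, -1]],       # 25/05/2019
    [[34, 46, 31, 27, 60], [27, 24, 23, 50, 35]],     # 27/05/2019
]
prices = [100.0, 101.5, 99.0]  # hypothetical price for each of the days above

features = [flatten_day(day) for day in days]
X, y = make_windows(features, prices, timesteps=2)
print((len(X), len(X[0]), len(X[0][0])))  # (1, 2, 10)
print(y)                                  # [99.0]
```

Each window of consecutive days plays the role of X0, X1, … and the target is the price of the day right after the window.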
