
I tried to re-create this example from Andrew (answer 3 here: Neural Network LSTM input shape from dataframe).

I have time-series data that looks like this: Date - Observation1 - Observation2 - Observation3 ... n. I want to use, for example, Observation1 and Observation2 to predict Observation3.

Based on the example I linked above, this works so far. The problem is that if I have too many data points, the script runs for a long time and then receives a KILL signal at this line:

padded_sequences = pad_sequences(df.cumulative_input_vectors.tolist(), 
                                 max_sequence_length).tolist()

I checked the memory usage: it keeps growing until both memory and swap are full, and then the script is stopped with a KILL signal (I also tried it on a machine with 16 GB of memory). Does anyone have an idea how I could avoid that, or is there a way to split up the data for the pad_sequences function?
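One common way around this kind of blow-up (a sketch, not code from the original post): the `.tolist()` call materialises the padded data as a huge nested Python list, which costs far more memory than a NumPy array of the same shape. Instead, you can pre-allocate a single `float32` array and copy each sequence into it. The helper name `pad_into_array` and the per-step feature-vector layout are assumptions for illustration:

```python
import numpy as np

def pad_into_array(sequences, max_len, n_features, dtype=np.float32):
    # Pre-allocate one contiguous array instead of building nested
    # Python lists, which is what exhausts memory with many datapoints.
    out = np.zeros((len(sequences), max_len, n_features), dtype=dtype)
    for i, seq in enumerate(sequences):
        arr = np.asarray(seq, dtype=dtype)[-max_len:]  # truncate like padding='pre'
        if len(arr):
            out[i, -len(arr):] = arr  # left-pad with zeros ('pre'-style)
    return out
```

The resulting array can be fed to an LSTM directly, so the `pad_sequences(...).tolist()` round trip is avoided entirely.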

  • The link leads to a deleted answer. Please point the link to the relevant question/answer. It would be even better to describe your problem in more detail instead of just linking to the other question/answer. – nemo Jan 28 '17 at 04:06
  • why don't you use batches which fit in memory? – Iman Mirzadeh Jan 28 '17 at 07:23
  • thank you nemo, that was stupid of me, I edited it! – Michael Jan 28 '17 at 09:56
  • @ImanMirzadeh I don't think it's the batch size, because it already stops earlier, at the pad_sequences() function... – Michael Jan 28 '17 at 09:57

0 Answers