How to reshape input for keras LSTM?

Question

I have a NumPy array of about 5000 rows and 4 columns (temp, pressure, speed, cost), so its shape is (5000, 4). Each row is an observation taken at a regular interval. This is my first time doing time series prediction, and I'm stuck on the input shape. I'm trying to predict a value one timestep after the last data point. How do I reshape the array into the 3D form that an LSTM model in Keras expects?

A small sample program would also be very helpful. I couldn't find any example or tutorial where the input has more than one feature (and isn't NLP).

Are all four observations input variables or is one of them an output? — DJK, Dec 22 '17 at 18:10
There are 4 columns (temp, pressure, speed, cost) and I want to predict a future value of one of them (mostly the first column, temp) using the past values of all the columns including the temp, if it makes sense. — user2559578, Dec 22 '17 at 18:32
Look at [this post](https://stackoverflow.com/questions/45764629/machine-learning-how-to-use-the-past-20-rows-as-an-input-for-x-for-each-y-value/45765082#45765082) first and see if it helps you understand how to reshape the data for an LSTM. If it doesn't, I could answer, but the questions are pretty similar. — DJK, Dec 22 '17 at 19:01

score 2 · Answer 1 · answered Dec 22 '17 at 19:13

The first question you should ask yourself is:

What is the timescale in which the input features encode relevant information for the value you want to predict?

Let's call this timescale prediction_context. It is measured in rows, i.e. the number of past timesteps fed to the network for each prediction.

You can now create your dataset:

import numpy as np

recording_length = 5000
n_features = 4
prediction_context = 10  # Number of past timesteps per example; change here

# The data you already have (random placeholders here)
X_data = np.random.random((recording_length, n_features))
to_predict = np.random.random((recording_length, 1))

# Make lists of training examples
X_in = []
Y_out = []
# Pair each window of prediction_context rows with the value that follows it
for i in range(recording_length - prediction_context):
    X_in.append(X_data[i:i + prediction_context, :])
    Y_out.append(to_predict[i + prediction_context])

# Convert them to numpy arrays
X_train = np.array(X_in)
Y_train = np.array(Y_out)

At the end, X_train.shape is (recording_length - prediction_context, prediction_context, n_features), which is (4990, 10, 4) with the values above.

So you will need to make a trade-off between the length of your prediction context and the number of examples you will have to train your network.
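For the case described in the comments, where the target is one of the four columns (temp, assumed here to be column 0), the same windowing can be sketched as below. The helper name make_windows and the random stand-in data are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def make_windows(data, target_col, context):
    """Slice a (time, features) array into overlapping training windows."""
    n_samples = data.shape[0] - context
    # X[i] holds rows i .. i+context-1 with all features
    X = np.stack([data[i:i + context] for i in range(n_samples)])
    # y[i] is the target column one step after window i ends
    y = data[context:, target_col]
    return X, y

# Stand-in for the real (5000, 4) array: temp, pressure, speed, cost
rng = np.random.default_rng(0)
data = rng.random((5000, 4))

X_train, y_train = make_windows(data, target_col=0, context=10)
print(X_train.shape)  # (4990, 10, 4)
print(y_train.shape)  # (4990,)
```

The first array follows the (samples, timesteps, features) layout that a Keras LSTM layer takes as input, so the past values of temp are part of every window alongside the other three columns.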

The data points (instances) are observed at 15-minute intervals, if that's what you mean by timescale. The y_data is basically one of the columns in the dataset: I have four columns and I want to predict the future values of one of them (say, temp) using the past values of that column and of the other columns (pressure, speed, cost). — user2559578, Dec 22 '17 at 19:54
I also want to understand how to reshape the 2D array to 3D for the LSTM. — user2559578, Dec 22 '17 at 20:30
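On the 2D-to-3D question: a plain reshape and a sliding window both give a 3D array, but they are not the same thing. The sketch below contrasts the two on a placeholder array; the variable names and the window length of 10 are assumptions chosen for illustration.

```python
import numpy as np

timesteps, n_features = 10, 4
# Placeholder (5000, 4) array with distinct values so rows are easy to trace
data = np.arange(5000 * n_features, dtype=float).reshape(5000, n_features)

# Plain reshape: back-to-back blocks of 10 rows each, no overlap
blocks = data.reshape(-1, timesteps, n_features)
print(blocks.shape)  # (500, 10, 4)

# Sliding windows: one window per starting row, consecutive windows overlap
windows = np.stack(
    [data[i:i + timesteps] for i in range(len(data) - timesteps + 1)]
)
print(windows.shape)  # (4991, 10, 4)
```

The reshape yields far fewer examples because each row is used once. The sliding version reuses rows across windows; when each window also needs a target one step ahead, the last window has no target and is dropped, leaving 4990 examples.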
