
I would like to implement an LSTM in Keras for streaming time-series prediction -- i.e., running online, receiving one data point at a time. This is explained well here, but, as one might expect, training an online LSTM can be prohibitively slow. I would like to train my network on mini-batches and then test (run prediction) online. What is the best way to do this in Keras?

For example, a mini-batch could be a sequence of 1000 data values ([33, 34, 42, 33, 32, 33, 36, ... 24, 23]) that occur at consecutive time steps. To train the network I've specified an array X of shape (900, 100, 1), where there are 900 sequences of length 100, and an array y of shape (900, 1). E.g.,

X[0] = [[33], [34], [42], [33], ...]
X[1] = [[34], [42], [33], [32], ...]
...
X[899] = [..., [24]]

y[899] = [23]

So for each sequence X[i], there is a corresponding y[i] that represents the next value in the time-series -- what we want to predict.
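
For illustration, such windowed arrays could be built as follows (a minimal sketch; `series` is a placeholder for the raw 1000-value sequence, not a name from the original setup):

import numpy

def make_windows(series, window=100):
    # Slide a window of length `window` over the series; each window's
    # successor becomes the prediction target for that window.
    X = numpy.array([series[i:i + window] for i in range(len(series) - window)])
    y = numpy.array([series[i + window] for i in range(len(series) - window)])
    return X.reshape((-1, window, 1)), y.reshape((-1, 1))

# 1000 values with window=100 yield X.shape == (900, 100, 1) and y.shape == (900, 1)
X, y = make_windows(series)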

At test time I want to predict data values 1000 through 1999. I do this by feeding an array of shape (1, 100, 1) for each step from 1000 to 1999, where the model tries to predict the value at the next step.
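
A sketch of that test loop, under the same assumptions (`series` and the trained `model` are placeholders):

window = 100
testPredictions = []
for t in range(1000, 2000):
    # Use the most recent `window` observed values to predict the value at step t
    x = numpy.asarray(series[t - window:t]).reshape((1, window, 1))
    testPredictions.append(model.predict(x, batch_size=1)[0, 0])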

Is this the recommended approach and setup for my problem? Enabling statefulness may be the way to go for a purely online implementation, but in Keras this requires a consistent batch_input_shape in training and testing, which would not work for my intent of training on mini-batches and then testing online. Or is there a way I can do this?

UPDATE: Trying to implement the network as @nemo recommended

I ran my own dataset on an example network from a blog post "Time Series Prediction with LSTM Recurrent Neural Networks in Python with Keras", and then tried implementing the prediction phase as a stateful network.

The model building and training is the same for both:

# Create and fit the LSTM network (Keras 1.x API)
from keras.models import Sequential
from keras.layers import Dense, LSTM

numberOfEpochs = 10
look_back = 30
model = Sequential()
model.add(LSTM(4, input_dim=1, input_length=look_back))  # 4 units; input sequences of look_back scalars
model.add(Dense(1))  # single regression output: the next value
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, nb_epoch=numberOfEpochs, batch_size=1, verbose=2)
# trainX.shape = (6883, 30, 1)
# trainY.shape = (6883,)
# testX.shape  = (3375, 30, 1)
# testY.shape  = (3375,)

Batch prediction is done with:

batch_size = 1  # match the batch size used during training
trainPredict = model.predict(trainX, batch_size=batch_size)
testPredict = model.predict(testX, batch_size=batch_size)

To try a stateful prediction phase, I used the same model setup and training as before, but then ran the following:

import numpy

w = model.get_weights()
batch_size = 1
# Rebuild the same architecture, this time with statefulness enabled
model = Sequential()
model.add(LSTM(4, batch_input_shape=(batch_size, look_back, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
# Predict one sample at a time so the layer state carries over between calls
trainPredictions, testPredictions = [], []
for trainSample in trainX:
    trainPredictions.append(model.predict(trainSample.reshape((1, look_back, 1)), batch_size=batch_size))
trainPredict = numpy.concatenate(trainPredictions).ravel()
for testSample in testX:
    testPredictions.append(model.predict(testSample.reshape((1, look_back, 1)), batch_size=batch_size))
testPredict = numpy.concatenate(testPredictions).ravel()

To inspect the results, I plotted the actual (normalized) data in blue, the predictions on the training set in green, and the predictions on the test set in red.

[Figure: results from batch prediction]

[Figure: results from stateful prediction]

The first figure shows the batch predictions and the second the stateful predictions. Any ideas what I'm doing incorrectly?

BoltzmannBrain

  • Use statefulness both during training and online testing. During training just reset the state after each batch. It also makes things easier since you'll structure all your data the same. – runDOSrun Jul 22 '16 at 13:53
  • For anyone else wanting to understand stateful LSTM in Keras, there's a tutorial here: http://philipperemy.github.io/keras-stateful-lstm/ – Jonno_FTW Oct 26 '16 at 23:36
  • `model.set_weights(w)` is missing from the last code block. Also, as nemo pointed out in a comment, I eyeballed the second set of results and the ups and downs have a very high correlation with the blue lines. – Unknown Jul 09 '20 at 21:19
  • @BoltzmannBrain: Is this code reliable? http://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/ – just_learning Jan 15 '22 at 12:19

1 Answer


If I understand you correctly, you are asking whether you can enable statefulness after training. This should be possible, yes. For example:

from keras.layers import Input, Dense, SimpleRNN
from keras.models import Model

# Train with statefulness disabled; `timesteps` is your sequence length
inputs = Input(shape=(timesteps, 1))
net = Dense(1)(SimpleRNN(4, stateful=False)(inputs))
model = Model(input=inputs, output=net)

model.fit(...)

# Rebuild the same architecture with statefulness enabled and copy the weights
w = model.get_weights()
inputs = Input(batch_shape=(1, timesteps, 1))  # stateful layers need a fixed batch size
net = Dense(1)(SimpleRNN(4, stateful=True)(inputs))
model = Model(input=inputs, output=net)
model.set_weights(w)

After that you can predict in a stateful way.
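
For a purely online loop you can then feed one point at a time. A minimal sketch, assuming the stateful network was instead rebuilt with batch_shape=(1, 1, 1) so each call consumes a single new timestep (`incoming_stream` is a placeholder, not part of any API):

import numpy

model.reset_states()            # clear the layer state at the start of a stream
for value in incoming_stream:   # placeholder for the live data source
    x = numpy.asarray(value, dtype='float32').reshape((1, 1, 1))
    next_value = model.predict(x, batch_size=1)[0, 0]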

nemo
  • Thank you. I've tried implementing this, but still have some issues. Please see the edited question. I really appreciate your help here! – BoltzmannBrain Jul 21 '16 at 02:49
  • Two things: you are never resetting the state of your recurrent layers, and the data looks very similar but scaled differently; it may be a simple scaling problem in your plotting, for example. In any case this question is not a fit for SO anymore. It is probably best to do testing on your own, boiling things down and formulating shorter questions :) – nemo Jul 22 '16 at 13:51