
I have a network model that is trained using batch training. Once it is trained, I want to predict the output for a single example.

Here is my model code:

model = Sequential()
model.add(Dense(32, batch_input_shape=(5, 1, 1)))
model.add(LSTM(16, stateful=True))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

I have a sequence of single inputs to single outputs. I'm doing some test code to map characters to next characters (A->B, B->C, etc).

I create input data of shape (15, 1, 1) and output data of shape (15, 1), then call:

model.fit(x, y, nb_epoch=epochs, batch_size=5, shuffle=False, verbose=0)
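
For concreteness, arrays with the shapes described above could be built roughly like this. The question does not show how the characters are encoded, so the integer encoding (A=0, B=1, ...) and the 16-letter alphabet are assumptions made only for illustration.

```python
import numpy as np

# Hypothetical encoding: A=0, B=1, ... (the question does not show its encoding).
alphabet = "ABCDEFGHIJKLMNOP"  # 16 letters give 15 (input, next-character) pairs
codes = np.array([ord(c) - ord("A") for c in alphabet], dtype="float32")

x = codes[:-1].reshape(-1, 1, 1)  # (samples, timesteps, features) = (15, 1, 1)
y = codes[1:].reshape(-1, 1)      # (samples, outputs) = (15, 1)

print(x.shape, y.shape)
```

With 15 samples and batch_size=5, training sees exactly three full batches per epoch.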

The model trains, and now I want to take a single character and predict the next character (input A, it predicts B). I create an input of shape (1, 1, 1) and call:

pred = model.predict(x, batch_size=1, verbose=0)

This gives:

ValueError: Shape mismatch: x has 5 rows but z has 1 rows

I saw one solution was to add "dummy data" to your predict values, so the input shape for the prediction would be (5,1,1) with data [x 0 0 0 0] and you would just take the first element of the output as your value. However, this seems inefficient when dealing with larger batches.
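
A minimal sketch of that padding idea, for reference. The helper name `predict_single_padded` is hypothetical, and `fake_predict` is only a stand-in for `model.predict` so the snippet runs without a trained model.

```python
import numpy as np

def predict_single_padded(predict_fn, sample, batch_size):
    """Pad one sample with zero rows up to batch_size, predict, keep row 0."""
    sample = np.asarray(sample, dtype="float32")
    # sample is (1, timesteps, features); build (batch_size, timesteps, features)
    padded = np.zeros((batch_size,) + sample.shape[1:], dtype=sample.dtype)
    padded[0] = sample[0]
    return predict_fn(padded)[0]

# Stand-in for model.predict: insists on a batch of 5, like the model above.
def fake_predict(batch):
    assert batch.shape[0] == 5
    return batch[:, 0, :] + 1.0

result = predict_single_padded(fake_predict, np.array([[[3.0]]]), batch_size=5)
print(result)
```

The cost is that every call computes batch_size rows to keep one. With a stateful layer, each batch row also carries its own state, so the real sample should stay in the same row between calls.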

I also tried to remove the batch size from the model creation, but I got the following message:

ValueError: If a RNN is stateful, a complete input_shape must be provided (including batch size).

Is there another way? Thanks for the help.

Lucas

2 Answers


Currently (Keras v2.0.8) it takes a bit more effort to get predictions on single rows after training in batch.

Basically, the batch_size is fixed at training time, and has to be the same at prediction time.

The workaround right now is to take the weights from the trained model, and use those as the weights in a new model you've just created, which has a batch_size of 1.
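
To see why copying weights between models with different batch sizes is sound, here is a toy recurrent layer in plain NumPy. It is not Keras code, and the names `W`, `U`, `b` and `rnn_forward` are illustrative only. The point it shows is that the weight shapes depend on the number of features and units, not on the batch size.

```python
import numpy as np

rng = np.random.default_rng(0)
features, units = 1, 4

# Weight shapes depend on feature/unit counts only, never on batch size.
W = rng.normal(size=(features, units))  # input kernel
U = rng.normal(size=(units, units))     # recurrent kernel
b = np.zeros(units)

def rnn_forward(x):
    """x: (batch, timesteps, features) -> final hidden state (batch, units)."""
    h = np.zeros((x.shape[0], units))
    for t in range(x.shape[1]):
        h = np.tanh(x[:, t, :] @ W + h @ U + b)
    return h

batch = rng.normal(size=(5, 3, features))
full = rnn_forward(batch)        # batch of 5, as during training
single = rnn_forward(batch[:1])  # same weights, batch of 1

print(np.allclose(full[0], single[0]))
```

The first row of the batch-of-5 result matches the batch-of-1 result, which is the property the weight-copying workaround relies on.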

The quick code for that is

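# create_model and compile_params are your own helper and settings (not Keras APIs)
model = create_model(batch_size=64)
model.fit(X, y)
weights = model.get_weights()
# Same architecture, but built with a batch size of 1
single_item_model = create_model(batch_size=1)
single_item_model.set_weights(weights)
single_item_model.compile(**compile_params)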

Here's a blog post that goes into more depth: https://machinelearningmastery.com/use-different-batch-sizes-training-predicting-python-keras/

I've used this approach in the past to keep multiple models at prediction time: one that makes predictions on big batches, one that makes predictions on small batches, and one that makes predictions on single items. Since batch predictions are much more efficient, this gives us the flexibility to take in any number of prediction rows (not just a number that is evenly divisible by batch_size), while still getting predictions quickly.
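
One way to route an arbitrary number of rows across such models is a greedy split, sketched below. The function `plan_batches` and the batch sizes 64, 8 and 1 are hypothetical choices for illustration, not something the approach prescribes.

```python
def plan_batches(n_rows, batch_sizes=(64, 8, 1)):
    """Greedily split n_rows into (batch_size, n_batches) pairs, largest first."""
    plan = []
    remaining = n_rows
    for size in sorted(batch_sizes, reverse=True):
        count, remaining = divmod(remaining, size)
        if count:
            plan.append((size, count))
    if remaining:
        raise ValueError("batch_sizes must include 1 to cover every row count")
    return plan

print(plan_batches(150))  # [(64, 2), (8, 2), (1, 6)]
```

Each pair says how many full batches to send to the model built with that batch size, so 150 rows need ten predict calls instead of 150.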

ClimbsRocks

@ClimbsRocks showed a nice workaround. I cannot provide a "correct" answer in the sense of "this is how Keras intends it to be done", but I can share another workaround which might help somebody, depending on the use case.

In this workaround I use predict_on_batch(). This method lets you pass a single sample out of a batch without raising an error. Unfortunately, it returns an array in the shape the target has according to the training settings. However, each entry of that array then holds the prediction for your single sample.

You can access it like this:

to_predict = ...  # a single sample that would be part of a batch (must have the right shape)
model.predict_on_batch(to_predict)[0].flatten()  # flatten() is optional

The result of the prediction is exactly the same as if you passed an entire batch to predict().


Here is a code example. The code is from my question, which also deals with this issue (but in a slightly different manner).

import numpy as np
from keras.models import Sequential
from keras.layers import GRU

sequence_size      = 5
number_of_features = 1
input_shape        = (sequence_size, number_of_features)  #renamed so it does not shadow the built-in input()
batch_size         = 2

model = Sequential()

#Of course you can replace the Gated Recurrent Unit with a LSTM-layer
model.add(GRU(100, return_sequences=True, activation='relu', input_shape=input_shape, batch_size=batch_size, name="GRU"))
model.add(GRU(1, return_sequences=True, activation='relu', input_shape=input_shape, batch_size=batch_size, name="GRU2"))
model.compile(optimizer='adam', loss='mse')

model.summary()

#Summary-output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
GRU (GRU)                    (2, 5, 100)               30600     
_________________________________________________________________
GRU2 (GRU)                   (2, 5, 1)                 306       
=================================================================
Total params: 30,906
Trainable params: 30,906
Non-trainable params: 0


def generator(data, batch_size, sequence_size, num_features):
    """Simple generator"""
    while True:
        for i in range(len(data) - (sequence_size * batch_size + sequence_size) + 1):
            start = i
            end   = i + (sequence_size * batch_size)

            yield data[start : end].reshape(batch_size, sequence_size, num_features), \
                    data[end - ((sequence_size * batch_size) - sequence_size) : end + sequence_size].reshape(batch_size, sequence_size, num_features)

#Task: Predict the continuation of a linear range
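data = np.arange(100)
#Assumption: total_batches was not defined in the original; here one epoch covers every window the generator yields (86)
total_batches = len(data) - (sequence_size * batch_size + sequence_size) + 1
hist = model.fit_generator(
                generator=generator(data, batch_size, sequence_size, number_of_features),
                steps_per_epoch=total_batches,
                epochs=200,
                shuffle=False
            )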

to_predict = np.asarray([[np.asarray([x]) for x in range(95,100,1)]]) #Only single element of a batch
correct    = np.asarray([100,101,102,103,104])
print( model.predict_on_batch(to_predict)[0].flatten() )

#Output:
[ 99.92908 100.95854 102.32129 103.28584 104.20213 ]
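
To make the training task explicit, the sketch below reproduces the slicing from the generator above for its first window only. The function name `windows` is hypothetical. It shows that each target row is the input row shifted forward by sequence_size.

```python
import numpy as np

def windows(data, batch_size, sequence_size, num_features, i=0):
    """Return the (inputs, targets) pair the generator yields at offset i."""
    start = i
    end = i + sequence_size * batch_size
    x = data[start:end].reshape(batch_size, sequence_size, num_features)
    # Targets start one sequence later than the inputs and end one sequence later.
    y_start = end - (sequence_size * batch_size - sequence_size)
    y = data[y_start:end + sequence_size].reshape(batch_size, sequence_size, num_features)
    return x, y

x, y = windows(np.arange(100), batch_size=2, sequence_size=5, num_features=1)
print(x[:, :, 0])  # rows 0..4 and 5..9
print(y[:, :, 0])  # rows 5..9 and 10..14
```

So for the input 95..99 the expected continuation is 100..104, which is what the prediction above approximates.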
Markus