
As the question says, I can only predict from my model with model.predict_on_batch(). If I use model.predict(), Keras tries to concatenate all the outputs together, and that fails.
For my application (a sequence-to-sequence model) it is faster to do the grouping on the fly. But even if I had done the grouping in Pandas and only used the Dataset for the padded batches, wouldn't .predict() still fail?

If I can get predict_on_batch to work, then that's what I'll use. But I can only predict on the first batch of the Dataset. How do I get predictions for the rest? I can't loop over the Dataset, I can't consume it...

Here's a smaller code example. The group is the same as the labels, but in the real world they would obviously be two different things. There are 3 classes, a maximum of 2 values per sequence, and 2 rows of data per batch. There are a lot of comments, and I nicked parts of the windowing from somewhere on Stack Overflow. I hope it is fairly legible to most.

If you have any other suggestions on how to improve the code, please comment. But no, that's not what my real model looks like at all, so suggestions for the model itself probably aren't helpful.

EDIT: Tensorflow version 2.1.0

import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Bidirectional, Masking, Input, Dense, GRU
import random
import numpy as np

random.seed(100)
# input data
feature = list(range(3, 14))
# shuffle data
random.shuffle(feature)
# make label from feature data, +1 because we are padding with zero
label = [feat // 5 + 1 for feat in feature]
group = label[:]
# random.shuffle(group)
max_group = 2
batch_size = 2

print('Data:')
print(*zip(group, feature, label), sep='\n')

# make dataset from data arrays
ds = tf.data.Dataset.zip((tf.data.Dataset.from_tensor_slices({'group': group, 'feature': feature}), 
                          tf.data.Dataset.from_tensor_slices({'label': label})))
# group by window
ds = ds.apply(tf.data.experimental.group_by_window(
    # use feature as key (you may have to use tf.reshape(x['group'], []) instead of tf.cast)
    key_func=lambda x, y: tf.cast(x['group'], tf.int64),
    # convert each window to a batch
    reduce_func=lambda _, window: window.batch(max_group),
    # use the maximum group size as the window size
    window_size=max_group))

# shuffle at most 100k rows, but commented out because we don't want to predict on shuffled data
# ds = ds.shuffle(int(1e5)) 
ds = ds.padded_batch(batch_size,
                     padded_shapes=({s: (None,) for s in ['group', 'feature']}, 
                                    {s: (None,) for s in ['label']}))
# show dataset contents
print('Result:')
for element in ds:
    print(element)

# Keras matches the name in the input to the tensor names in the first part of ds
inp = Input(shape=(None,), name='feature')
# RNNs require an additional rank, even if it is a degenerate dimension
duck = tf.expand_dims(inp, axis=-1)
rnn = GRU(32, return_sequences=True)(duck)
# again Keras matches names
out = Dense(max(label)+1, activation='softmax', name='label')(rnn)
model = Model(inputs=inp, outputs=out)
model.summary()
model.compile(loss='sparse_categorical_crossentropy', metrics=['accuracy'])

model.fit(ds, epochs=3)

# this is the problem: only the first batch of the Dataset gets predicted
model.predict_on_batch(ds)
grofte
  • Your question is self-contradicting. You can feed any number of samples to `predict_on_batch`. So why not just make a bigger batch? – Zabir Al Nazi Apr 24 '20 at 13:26
  • If I make the batch size 200'000 then every single sequence will be padded to the length of the longest sequence. That's not really feasible. Plus, I will need the model to somehow run with a batch size of 200'000. The Keras default for .predict() is 32. – grofte Apr 24 '20 at 13:38
  • you can never run with such a big batch size unless you own a GPU cluster. – Zabir Al Nazi Apr 24 '20 at 13:41
  • Yes, I need to make multiple predictions. No, I can't really imagine that anyone would ever run any model with that large of a batch size. Especially for inference where the batch size is completely irrelevant. Yes, I have a couple of 2080s but that's not really enough. Right now I can only really imagine making a new dataset for each batch to predict on and that's definitely not the point of them. – grofte Apr 24 '20 at 13:49
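To illustrate the padding point from the comments: `padded_batch` only pads each batch to the longest sequence *within that batch*, so smaller batches avoid padding everything to one global maximum length. A minimal sketch with toy sequences (not the question's data):

```python
import tensorflow as tf

# four variable-length sequences
seqs = [[1], [2, 3, 4, 5], [6, 7], [8]]
ds = tf.data.Dataset.from_generator(lambda: iter(seqs),
                                    output_types=tf.int32,
                                    output_shapes=(None,))

# each batch of 2 is padded only to its own longest sequence
for batch in ds.padded_batch(2, padded_shapes=(None,)):
    print(batch.shape)   # (2, 4) then (2, 2) -- not everything padded to 4
```

With one giant batch, every sequence would instead be padded to the single longest sequence in the whole dataset.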

1 Answer


You can iterate over the dataset, like so, remembering what is "x" and what is "y" in typical notation:

for item in ds:
    xi, yi = item
    pi = model.predict_on_batch(xi)
    print(xi["group"].shape, pi.shape)

Of course, this predicts on each dataset element (i.e. each padded batch) individually. Otherwise you'd have to build the batches yourself by grouping matching shapes together, since the padded length is allowed to vary from batch to batch.
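To collect all predictions rather than just print shapes, append each batch's output to a list (the padded length differs between batches, so the arrays can't simply be stacked). A self-contained sketch with a hypothetical stand-in model, not the question's GRU:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense

# hypothetical stand-in model: softmax over 3 classes per timestep
inp = Input(shape=(None,), name='feature')
out = Dense(3, activation='softmax', name='label')(tf.expand_dims(inp, axis=-1))
model = Model(inputs=inp, outputs=out)

# two padded batches with different sequence lengths,
# as the question's group_by_window + padded_batch pipeline yields
batches = [np.array([[3., 4.], [5., 0.]], dtype='float32'),   # padded length 2
           np.array([[7.], [8.]], dtype='float32')]           # padded length 1

# keep per-batch predictions in a list; shapes differ, so don't concatenate
preds = [model.predict_on_batch(b) for b in batches]
for p in preds:
    print(p.shape)   # (2, 2, 3) then (2, 1, 3)
```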

komodovaran_
  • Thank you so much! I really need to point out that an equivalent Python loop does not work: `for ex in ds: pi = model.predict_on_batch([ex[0]]); print(ex[0]["group"].shape, pi.shape)` I don't understand Dataset or what they are doing with it. The idea is really, really nice as a stand-in for Spark but it needs to work. – grofte Apr 27 '20 at 07:54
  • Is it possible to expand a bit on the last part, and maybe give an example? I don't understand the expected shape of the element to be passed to `predict_on_batch()`... For example this does not work for me: `model.predict_on_batch([xi, xi, xi])` – Kurt Aug 11 '22 at 18:27
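Regarding the last comment: `predict_on_batch` expects a single batch whose first axis is the batch axis; wrapping the array in a list is interpreted as *multiple model inputs*, which fails for a single-input model. A hypothetical single-input sketch:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense

# hypothetical single-input model taking (batch, seq) features
inp = Input(shape=(None,), name='feature')
out = Dense(3, activation='softmax')(tf.expand_dims(inp, axis=-1))
model = Model(inputs=inp, outputs=out)

xi = np.ones((2, 4), dtype='float32')   # one batch: 2 sequences of length 4
p = model.predict_on_batch(xi)          # OK: first axis is the batch axis
print(p.shape)                          # (2, 4, 3)

# model.predict_on_batch([xi, xi, xi])  # fails: a list means three separate
#                                       # inputs, but the model has only one
```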