58

In Keras, to predict the class of test data, predict_classes() is used.

For example:

classes = model.predict_classes(X_test, batch_size=32)

My question is: I know how batch_size is used in training, but why does prediction need a batch_size? How does it work?

malioboro

2 Answers

47

Keras can predict on multiple samples at the same time: if you input 100 samples, Keras computes one prediction for each sample, giving 100 outputs. This computation can also be done in batches, whose size is defined by the batch_size.

This is just in case you cannot fit all the data in the CPU/GPU RAM at the same time and batch processing is needed.
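A quick illustrative sketch (toy model and data, not from the question; it assumes a Dense-only model, so there are none of the batch-dependent layers such as batch normalization that the comments below touch on):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Toy model and data, purely for illustration.
model = Sequential([Dense(3, activation='softmax', input_shape=(4,))])
model.compile(optimizer='sgd', loss='categorical_crossentropy')
X_test = np.random.rand(100, 4)  # 100 samples, 4 features each

# Keras splits X_test into chunks of `batch_size` samples and runs one
# forward pass per chunk; only one chunk has to fit in CPU/GPU memory.
probs_32 = model.predict(X_test, batch_size=32)    # 4 forward passes of at most 32 samples
probs_100 = model.predict(X_test, batch_size=100)  # 1 forward pass of all 100 samples

# The predictions are identical (up to floating-point noise);
# batch_size only changes how much is computed at once.
assert np.allclose(probs_32, probs_100, atol=1e-5)

classes = model.predict_classes(X_test, batch_size=32)  # argmax of the probabilities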

Dr. Snoopy
  • Ooh, I see, so `batch_size` determines how much data fits in CPU/GPU RAM at once, so the accuracy of the model shouldn't depend on `predict_classes`'s `batch_size` value, right? – malioboro Jun 20 '16 at 01:34
  • 4
    @malioboro That depends. If you, for example, use batch normalization which does not use estimated values during prediction (`mode=1` does that IIRC) then the batch size does indeed have an influence on the outcome. – nemo Jun 20 '16 at 02:13
  • @nemo that's right, I got it :) thanks for your explanation – malioboro Jun 20 '16 at 15:07
  • 3
    For those wondering what @nemo's comment is referring to, see the documentation of [latest Keras 1](https://faroit.github.io/keras-docs/1.2.2/layers/normalization/). From the Keras 2 release notes: "The mode argument of BatchNormalization has been removed; BatchNorm now only supports mode 0" – bers Apr 19 '18 at 13:49
  • 8
    If I want to classify say 10,000 images, is it fastest to pass all images to predict and use a batch_size=10,000? What's the best way to optimize speed of inference of a large number of images? – user3731622 Feb 04 '19 at 23:09
  • I am also interested in the comment above's answer. – b-fg Mar 11 '19 at 07:39
  • @b-fg It's not appropriate to ask a different question in comments; you should make a new question with all the information. – Dr. Snoopy Mar 11 '19 at 11:14
  • 4
    However, the tensorflow documentation for predict says: "batch_size: Integer or None. Number of samples per gradient update." So they do talk about gradients they update, which is odd within predict... – frank Feb 23 '20 at 11:54
  • @frank this has been fixed: https://github.com/tensorflow/tensorflow/commit/4721480639b185cc9ce2eb1dbbcd25984a068453 – bers Oct 02 '20 at 08:03
2

The reason is the same as why you need a batch size for training: you cannot fit all the data into one single batch.

Similarly, if you have millions of data points to predict, you obviously will not be able to pass them all in at one go (a single batch).

After all, both training and prediction perform a forward pass on the batch data.

Hence, you need the batch size to control/limit the number of data points in a single batch and to distribute the prediction across multiple batches.
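Conceptually, the batch_size argument to predict just drives a loop like this sketch (hypothetical helper name; Keras does this splitting internally):

import numpy as np

def predict_in_batches(model, X, batch_size=32):
    # Split X into consecutive chunks of at most `batch_size` samples,
    # run one forward pass per chunk, and stitch the outputs back together.
    outputs = []
    for start in range(0, len(X), batch_size):
        batch = X[start:start + batch_size]            # at most `batch_size` samples in memory
        outputs.append(model.predict_on_batch(batch))  # single forward pass on this batch
    return np.concatenate(outputs, axis=0)

# Roughly equivalent to model.predict(X, batch_size=32), but it makes explicit
# that only one batch of data ever needs to fit in CPU/GPU memory at a time.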

Dharman
Argho Chatterjee
  • Not being able to fit all the data into one batch is not the only reason why batches are used in training. Batches are also used to introduce stochasticity into the training process. – interoception Mar 24 '22 at 17:10