Difference in predictions between model.predict() and model.predict_generator() in Keras

Question

When I use model.predict_generator() on my test set (images), I get one set of predictions, and when I use model.predict() on the same test set, I get a different set of predictions.

To use model.predict_generator(), I created the generator as follows:

I used ImageDataGenerator() with no arguments and called flow_from_directory with shuffle=False.
No augmentation or preprocessing (normalization, zero-centering, etc.) was applied to the images while training the model.

I am working on a binary classification problem involving dogs and cats (from Kaggle). The test set contains 1000 cat images. With model.predict_generator() I get 87% accuracy, i.e. 870 images are classified correctly, but with model.predict() I get 83% accuracy.

This is confusing because both should give identical results, right? Thanks in advance :)

Are you using the same model in both cases? Could you share your code as well? – Sargam Modak, Sep 13 '17 at 06:10
Have you made sure that predict_generator() yields exactly one epoch? Since Keras 2 the generators are step-based (see fchollet's comment here https://github.com/fchollet/keras/issues/5818), so you might have a different number of samples in your predictions. You can also reset generators to make sure you always start with sample #0. – petezurich, Sep 13 '17 at 06:33
@petezurich I don't quite understand what you mean. Could you please provide some sample code? – Abhijit Balaji, Sep 13 '17 at 08:10
@AbhijitBalaji I think it would be easier if you provided your code. :0) Right now we can only guess what's wrong. Apart from that: you can reset a generator with `your_image_generator.reset()` before you start to predict. – petezurich, Sep 13 '17 at 13:22

score 2 · Accepted Answer · answered Sep 21 '17 at 05:19

@petezurich Thanks for your comment. Calling `generator.reset()` before `model.predict_generator()` and turning off shuffling for the generator passed to it fixed the problem.
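
To illustrate why the reset matters, here is a minimal sketch in plain Python. `ToyBatchIterator` and `collect` are hypothetical stand-ins written for this example, not Keras classes; the only assumption carried over from Keras is that the iterator remembers its position between calls until it is reset.

```python
class ToyBatchIterator:
    """Hypothetical stand-in for a directory iterator (not the real Keras class)."""

    def __init__(self, samples, batch_size):
        self.samples = list(samples)
        self.batch_size = batch_size
        self.position = 0  # persists between calls until reset() is called

    def reset(self):
        """Go back to the first sample."""
        self.position = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Wrap around at the end, so the iterator never runs out of batches.
        if self.position >= len(self.samples):
            self.position = 0
        batch = self.samples[self.position:self.position + self.batch_size]
        self.position += len(batch)
        return batch


def collect(iterator, steps):
    """Pull `steps` batches and flatten them, as a step-based predict loop would."""
    out = []
    for _ in range(steps):
        out.extend(next(iterator))
    return out


files = ["cat_%d.jpg" % i for i in range(6)]
it = ToyBatchIterator(files, batch_size=2)

next(it)                      # one batch was consumed earlier, e.g. by a quick check
stale = collect(it, steps=3)  # starts at cat_2.jpg and wraps around to cat_0.jpg

it.reset()
fresh = collect(it, steps=3)  # starts at cat_0.jpg, in the same order as `files`

print(stale == files, fresh == files)
```

Without the reset, the outputs are still one prediction per file, but they no longer line up with the file order, so comparing them against labels in file order would give a misleading accuracy.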

Abhijit Balaji

Great. I am happy that you could solve your problem. – petezurich Sep 21 '17 at 05:39
@petezurich Hey, there seems to be a different problem with generators now. Suppose you have 7427 test images and use a batch size of 16 in model.predict_generator(). The output is a vector of length 7424, not 7427. So for it to work, the batch size must exactly divide the number of samples: for an input array of length 7427, you should use a batch size of 7 to get an output array of length 7427 from predict_generator(). – Abhijit Balaji Oct 06 '17 at 09:34
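
The numbers in this last comment can be checked with a little arithmetic. The sketch below is plain Python, not Keras code, and `predicted_count` is a hypothetical helper. It assumes that the number of steps was computed with integer division (the thread does not show how it was set) and that the generator yields a shorter final batch instead of dropping it, which is worth verifying for your Keras version.

```python
import math


def predicted_count(n_samples, batch_size, steps):
    """Predictions returned by `steps` batches, assuming the final batch may be
    shorter than batch_size and the generator is not run past one epoch."""
    return min(n_samples, steps * batch_size)


n_samples, batch_size = 7427, 16

floor_steps = n_samples // batch_size            # drops the 3 leftover images
ceil_steps = math.ceil(n_samples / batch_size)   # one extra, shorter batch

print(predicted_count(n_samples, batch_size, floor_steps))
print(predicted_count(n_samples, batch_size, ceil_steps))
print(n_samples % 7)  # 0, so a batch size of 7 divides the set exactly
```

Under those assumptions, rounding the step count up may be an alternative to picking a batch size that divides the number of samples exactly.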