
I was confused by this problem for several days...

My question is: why is there such a massive difference in training time between setting the batch_size to 1 and to 20 for my generator?

If I set the batch_size to 1, one epoch takes approximately 180~200 seconds. If I set the batch_size to 20, one epoch takes approximately 3000~3200 seconds.

However, this huge difference seems abnormal, since I would expect the reverse: batch_size = 1, training time -> 3000~3200 sec; batch_size = 20, training time -> 180~200 sec.

The input to my generator is not a file path but numpy arrays that are already loaded into memory via np.load(), so I don't think there is an I/O bottleneck.

I'm using Keras 2.0.3 and my backend is tensorflow-gpu 1.0.1.

I have seen the change in this merged PR, but it seems that it doesn't affect anything (the usage is the same as the original).

Linked here are the gist of my self-defined generator and the relevant part of my fit_generator call.

TheTiger
HappyStorm

4 Answers


When you use fit_generator, the number of samples processed in each epoch is batch_size * steps_per_epoch. From the Keras documentation for fit_generator: https://keras.io/models/sequential/

steps_per_epoch: Total number of steps (batches of samples) to yield from generator before declaring one epoch finished and starting the next epoch. It should typically be equal to the number of unique samples of your dataset divided by the batch size.

This is different from the behaviour of fit, where increasing batch_size typically speeds things up.

In conclusion, when you increase batch_size with fit_generator, you should decrease steps_per_epoch by the same factor if you want the training time per epoch to stay the same or go down.
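A minimal sketch of this point (the dataset size of 8000 is an assumption, not a number from the question; no model or generator is needed to show the arithmetic):

```python
# Hypothetical sketch: choose steps_per_epoch so that one epoch always
# covers the dataset exactly once, whatever the batch_size.
n_samples = 8000  # assumed dataset size


def steps_for(batch_size):
    # One epoch should satisfy: steps_per_epoch * batch_size == n_samples
    return n_samples // batch_size


print(steps_for(1))   # 8000 steps of 1 sample each
print(steps_for(20))  # 400 steps of 20 samples each

# If steps_per_epoch is instead left fixed (say at 8000) while batch_size
# grows from 1 to 20, each "epoch" silently processes 20x more samples
# (160000 instead of 8000), which is why the epoch takes far longer.
```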

pgrenholm
  • Thanks for answering! I think my original setting matches this (steps_per_epoch should equal total samples / batch_size). But the merged pull request confused me a lot... OK, I finally got it! Thank you very much! – HappyStorm Apr 18 '17 at 06:23
  • 3
    I'm not seeing the batch_size specified anywhere in the fit_generator() routine. So where is it inferring the batch_size from? Are they defining batch_size as (Length of your generator ) / (steps_per_epoch)? They say it's "typically equal", but this would imply it's exactly equal. – Alex R. Aug 29 '17 at 18:25
  • Quick question: why exactly does increasing batch_size speed things up with fit? It seems counterintuitive to me; I may just not understand the difference between how fit and fit_generator operate. – Aristides Jun 04 '18 at 14:55

Let's clear this up:

Assume you have a dataset with 8000 samples (rows of data) and you choose batch_size = 32 and epochs = 25.

This means that the dataset will be divided into 8000 / 32 = 250 batches, with 32 samples/rows in each batch. The model weights will be updated after each batch.

One epoch therefore trains on 250 batches, i.e. 250 updates to the model.

Here, steps_per_epoch = number of batches.

With 25 epochs, the model will pass through the whole dataset 25 times.

Ref - https://machinelearningmastery.com/difference-between-a-batch-and-an-epoch/
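The numbers above can be written out directly:

```python
# Worked arithmetic for the example above (8000 samples, batch_size 32).
samples = 8000
batch_size = 32
epochs = 25

steps_per_epoch = samples // batch_size    # 250 batches per epoch
total_updates = steps_per_epoch * epochs   # 6250 weight updates overall

print(steps_per_epoch, total_updates)      # 250 6250
```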


AbtabM

You should also take into account the following function parameters when working with fit_generator:

max_queue_size, use_multiprocessing and workers

max_queue_size - may pre-load more data than you actually expect, which, depending on your generator code, can do something unexpected or unnecessary and slow down your execution times.

use_multiprocessing together with workers - may spin up additional processes, which adds extra work for serialization and inter-process communication. First your data is serialized using pickle, then it is sent to the worker processes, then the processing happens inside those processes, and then the whole communication procedure repeats in reverse: the results are pickled and sent back to the main process. In most cases this is fast, but if you're processing dozens of gigabytes of data, or your generator is implemented in a sub-optimal fashion, you might get the slowdown you describe.
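As a toy illustration of that serialization cost (plain standard library, not Keras; the batch contents are an arbitrary assumption), every batch that crosses a process boundary makes a pickle round trip:

```python
import pickle

# Stand-in for one large batch that a generator worker would yield.
batch = [[float(i)] * 32 for i in range(10_000)]  # 10k "rows" of 32 floats

# With use_multiprocessing=True, each batch is pickled in the worker...
payload = pickle.dumps(batch, protocol=pickle.HIGHEST_PROTOCOL)

# ...sent over a pipe, and unpickled again in the main process.
restored = pickle.loads(payload)

print(restored == batch)  # True: the round trip is lossless, but not free
```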

Lu4

The whole thing comes down to this:

fit() works faster than fit_generator() because it can access the data directly in memory.

fit() takes numpy arrays that are already in memory, while fit_generator() pulls data from a sequence generator such as keras.utils.Sequence, which is slower.

prosti
  • `fit_generator()` is used in order to achieve multiprocessing and use multiple CPU cores, which contradicts your statement (see `use_multiprocessing` and `workers`). – Markus May 16 '19 at 22:00