Epoch 2/100
**1407/1407** [==============================] - 17s 12ms/step - loss: 1.9419 - accuracy: 0.2907 - val_loss: 2.1100 - val_accuracy: 0.2406
Epoch 3/100

For some odd reason it says that it only trains on 1407 instances, even though the shape of the training data I pass says 45k:

x_train.shape
=> (45000, 32, 32, 3)
Phil
    Can you show how you set up your model, and the parameters you used when you call `.fit()`? From the documentation: "batch_size: Integer or None. Number of samples per gradient update. If unspecified, batch_size will default to 32" and what is 45000/32? – G. Anderson Sep 28 '20 at 20:16

1 Answer


The number 1407 does not refer to the number of samples; it refers to the number of steps per epoch. For example, assume you have 1000 training samples. If you set `batch_size=100`, then it takes 10 steps per epoch to go through your entire data set. If you do not specify a `batch_size`, `model.fit` defaults it to 32. 45000/32 = 1406.25, which is rounded up to 1407. 1407 x 32 = 45024, so each epoch goes through your entire training set once plus 24 additional samples.

For validation data it is best to go through the validation set exactly once per epoch. Therefore, try to select the validation batch size such that validation_samples/validation_batch_size is an integer, then pass that quotient as `validation_steps` in `model.fit`.

Here is a handy little function that determines the largest usable batch size and the corresponding number of steps, where `length` is the number of samples in the data set and `b_max` is the maximum batch size you will allow based on your memory capacity.

def get_bs(length, b_max):
    # largest batch size that divides length evenly and does not exceed b_max
    batch_size = max(length // n for n in range(1, length + 1)
                     if length % n == 0 and length // n <= b_max)
    return batch_size, length // batch_size

# example
batch_size, steps = get_bs(1000, 80)
print(batch_size, steps)
# results in batch_size=50 and steps=20

This function is also useful for determining whether `length` is a prime number: set `b_max=length-1`, and it will return a batch_size of 1 only when `length` is prime.
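To see why the progress bar reads 1407, you can reproduce the arithmetic directly. This is just a sketch of the rounding Keras applies (ceiling division of samples by the default batch size of 32); the variable names here are illustrative, not part of any API:

```python
import math

# Keras reports steps per epoch, not samples:
# with the default batch_size of 32, ceil(45000 / 32) = 1407.
n_samples = 45000
batch_size = 32
steps_per_epoch = math.ceil(n_samples / batch_size)
print(steps_per_epoch)  # 1407
```

The last partial step contains only the leftover 45000 - 1406*32 = 8 samples, which is why 1407 x 32 slightly overshoots the data set size.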

Gerry P
  • Ok, I understand it now. But why does this code I read online show 45k steps when the batch size isn't specified anywhere? Shouldn't it also be 1407? ```Epoch 38/100 45000/45000 [==============================] - 15s 332us/sample - loss: 0.9966 - accuracy: 0.6496 - val_loss: 1.3946 - val_accuracy: 0.5340``` – Phil Sep 30 '20 at 17:28