Neural Network stuck at training

Question

Hello everyone, I started training a network and it got stuck: it did not finish the first epoch.

Here is the code I used:

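# Imports assumed from the standard Keras 2 layout; the original post omitted them
from keras import applications, optimizers
from keras.layers import Dense, Dropout, Flatten
from keras.models import Model, Sequential
from keras.preprocessing.image import ImageDataGenerator

top_model_weights_path = '/data/fc_model.h5'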
img_width, img_height = 150, 150
train_data_dir = '/data/train'
validation_data_dir = '/data/validation'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16
model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(150, 150, 3))
print('Model loaded.')
top_model = Sequential()
top_model.add(Flatten(input_shape=model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))
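# Load the previously trained classifier weights before fine-tuning
top_model.load_weights(top_model_weights_path)
# Stack the classifier on top of the VGG16 convolutional base
model = Model(inputs=model.input, outputs=top_model(model.output))
# Freeze the first 25 layers (or all of them, if the model has fewer)
for layer in model.layers[:25]:
    layer.trainable = False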
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')
model.fit_generator(
    train_generator,
    samples_per_epoch=nb_train_samples,
    epochs=epochs,
    validation_data=validation_generator,
    nb_val_samples=nb_validation_samples)

I am using transfer learning. I followed this online tutorial: Tutorial

Please help. Thank you.

You don't need `samples_per_epoch`, because the generator returned by `flow_from_directory` is an instance of Sequence, which calculates the number of steps from the number of images. The same goes for the validation steps. — nuric, May 30 '18 at 17:48
Also, please test the following: remove the input_shape parameter on the model, and change the height and the width to 224 instead of 150. I had strange errors when changing the input size with VGG in Keras. — elranu, May 30 '18 at 18:36
The screengrab you took shows that two more batches are needed. It seems that you just did not wait long enough. — modesitt, May 30 '18 at 20:50
I did not get any error. It just stayed stuck there for 5 h, while the first 123 batches took about 25 min. — Oussama, May 30 '18 at 21:13
I changed the height and width to the default size for VGG16 (224) and also removed samples_per_epoch, and the model still gets stuck at batch 123. — Oussama, May 31 '18 at 01:09
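
The step counts discussed in the comments can be worked out by hand. The following is a minimal sketch, assuming the Keras 2 argument names `steps_per_epoch` and `validation_steps`, which count batches rather than samples. `steps_for` is a hypothetical helper written for illustration, not part of Keras.

```python
import math


def steps_for(num_samples, batch_size):
    """Return the number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


# Values taken from the question
nb_train_samples = 2000
nb_validation_samples = 800
batch_size = 16

steps_per_epoch = steps_for(nb_train_samples, batch_size)
validation_steps = steps_for(nb_validation_samples, batch_size)

print("training batches per epoch:", steps_per_epoch)
print("validation batches per epoch:", validation_steps)

# Assumed usage with the Keras 2 argument names (not run here):
# model.fit_generator(
#     train_generator,
#     steps_per_epoch=steps_per_epoch,
#     epochs=epochs,
#     validation_data=validation_generator,
#     validation_steps=validation_steps)
```

With the numbers from the question this gives 125 training batches and 50 validation batches per epoch. That would be consistent with the comment that two more batches were needed at batch 123. It may also be worth checking whether a sample count (800) is being treated as a batch count for validation, since that would make the validation pass far longer than intended.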
