Keras stops after fit_generator with no error

Question

I am playing around with ResNet architectures to compare their performance with a CNN. I used this ResNet for my first test. I am reusing my code to load and prepare data for the network, and it works just fine. The rest of the script also seems to work, except when it gets to fit_generator. At fit_generator it pauses for a time and then seems to exit, reaching my print statement that says "what happened?". I am confused, since I would expect an error message or a crash. I am using Windows 10 with the latest version of Anaconda. In my conda environment I am using Python 3.6, Keras 2.3 (the latest version), and the latest version of TensorFlow. I would appreciate any insights.

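# Imports inferred from the names used below (assumes standalone Keras 2.3).
# get_IQsamples is the asker's own helper and is not shown in the post.
import numpy as np
from keras.layers import (Input, Conv2D, Activation, Add, MaxPooling2D,
                          Flatten, Dense, Dropout)
from keras.models import Model
from keras.initializers import glorot_uniform


def batch_generator(X_train, Y_train):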
    while True:
        for fl, lb in zip(X_train, Y_train):
            sam, lam = get_IQsamples(fl, lb)
            max_iter = sam.shape[0]
            sample = []     # store all the generated data batches
            label = []   # store all the generated label batches

            i = 0
            for d, l in zip(sam, lam):
                sample.append(d)
                label.append(l)
                i += 1
                if i == max_iter:
                    break
            sample = np.asarray(sample)        
            label = np.asarray(label)
            yield sample, label


def residual_stack(x, f):
    
    # 1x1 conv linear
    x = Conv2D(f, (1, 1), strides=1, padding='same', data_format='channels_last')(x)
    x = Activation('linear')(x)


    # residual unit 1    
    x_shortcut = x
    x = Conv2D(f, (3, 2), strides=1, padding="same", data_format='channels_last')(x)
    x = Activation('relu')(x)
    x = Conv2D(f, 3, strides=1, padding="same", data_format='channels_last')(x)
    x = Activation('linear')(x)

    # add skip connection
    if x.shape[1:] == x_shortcut.shape[1:]:
        x = Add()([x, x_shortcut])
    else:
        raise Exception('Skip Connection Failure!')


    # residual unit 2    
    x_shortcut = x
    x = Conv2D(f, 3, strides=1, padding="same", data_format='channels_last')(x)
    x = Activation('relu')(x)
    x = Conv2D(f, 3, strides = 1, padding = "same", data_format='channels_last')(x)
    x = Activation('linear')(x)

    # add skip connection
    if x.shape[1:] == x_shortcut.shape[1:]:
        x = Add()([x, x_shortcut])
    else:
        raise Exception('Skip Connection Failure!')


    # max pooling layer
    x = MaxPooling2D(pool_size=2, strides=None, padding='valid', data_format='channels_last')(x)

    return x


Define ResNet Model

# define resnet model

def ResNet(input_shape, classes):   

    # create input tensor
    x_input = Input(input_shape)
    x = x_input

    # residual stack
    num_filters = 40
    x = residual_stack(x, num_filters)
    x = residual_stack(x, num_filters)
    x = residual_stack(x, num_filters)
    x = residual_stack(x, num_filters)
    x = residual_stack(x, num_filters)


    # output layer
    x = Flatten()(x)
    x = Dense(128, activation="selu", kernel_initializer="he_normal")(x)
    x = Dropout(.5)(x)
    x = Dense(128, activation="selu", kernel_initializer="he_normal")(x)
    x = Dropout(.5)(x)
    x = Dense(classes , activation='softmax', kernel_initializer = glorot_uniform(seed=0))(x)


    # Create model
    model = Model(inputs = x_input, outputs = x)
    model.summary()

    return model


model = ResNet((32,32,2),8)

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])


print('Load complete!')
print('\n')


steps = val_length_train // batchsize
valid_steps = val_length // batchsize

history = model.fit_generator(
            generator=train_gen,
            epochs=3,
            verbose=0,
            steps_per_epoch=steps,
            validation_data=valid_gen,
            validation_steps=valid_steps,
            callbacks=[tensorboard])

print("what happened?")

So, you've got a problem involving a generator. It would be nice to have the generator to see. — Daniel Möller, Feb 03 '20 at 20:23
I am away from my computer, but the generator is a standard template I use and it works fine in 5 projects. Not sure why it would be the issue, but you're correct, I should supply it. — Robi Sen, Feb 03 '20 at 21:09
Here is the generator. It is Quadrature data, IQ data from a radio. — Robi Sen, Feb 03 '20 at 21:13
Here is a similar issue. I tried the author's approach and it did not help. I also rebuilt a new conda environment and still see the same behaviour: https://stackoverflow.com/questions/57350547/python-code-using-keras-crashes-on-call-to-model-fit-with-no-error-code — Robi Sen, Feb 03 '20 at 22:10
I am a little confused. Are you saying I should add use_multiprocessing=False to my fit_generator call? I thought the default was False. I'll try it though. Thanks — Robi Sen, Feb 04 '20 at 02:39
So I explicitly set it to False, i.e. use_multiprocessing=False, and I get the same outcome. — Robi Sen, Feb 04 '20 at 03:11
So, what is the problem actually? You get the message "what happened" printed, right? — Daniel Möller, Feb 04 '20 at 12:00
I added "what happened" to see if the script ran all the way through, which it did. Still trying to debug. — Robi Sen, Feb 04 '20 at 15:42
No data is processed. It hits fit_generator and stops with no error. I may have found the issue. I am printing the value of each variable and found a bug where I reset the steps value to 0. I'm testing this now. — Robi Sen, Feb 04 '20 at 22:10
Ok it works after I fixed the step var issue. Thank you for your questions. They drove me to a solution. — Robi Sen, Feb 04 '20 at 22:31
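
The fix reported in the last two comments (a steps variable that had been reset to 0) suggests a cheap guard before calling fit_generator. The following is only a sketch: the helper name `checked_steps` is hypothetical and not part of Keras, and the sample counts are made-up numbers. It assumes the step counts come from integer division, as in the question, and raises as soon as that division yields zero.

```python
def checked_steps(num_samples, batch_size, name="steps"):
    """Return num_samples // batch_size, raising if the result is not positive.

    Hypothetical helper (not a Keras API): it fails loudly instead of letting
    a zero step count reach fit_generator.
    """
    if batch_size <= 0:
        raise ValueError(f"batch_size must be positive, got {batch_size}")
    steps = num_samples // batch_size
    if steps <= 0:
        raise ValueError(
            f"{name} is {steps} (num_samples={num_samples}, "
            f"batch_size={batch_size}); nothing would be trained"
        )
    return steps


# Made-up numbers standing in for val_length_train, val_length and batchsize.
steps = checked_steps(1000, 32, name="steps")              # 1000 // 32 = 31
valid_steps = checked_steps(200, 32, name="valid_steps")   # 200 // 32 = 6
print(steps, valid_steps)
```

Printing the two values right before the fit_generator call, as the asker eventually did, serves the same purpose. The guard just makes the check impossible to overlook.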

score 0 · Answer 1 · answered Feb 03 '20 at 22:06

Sort of. If there is an error, it will still be thrown and printed even if verbose is 0. That said, verbose 0 seems to cause issues for some people. This post is from 2017, but I've seen the same issue as recently as Nov 2019: https://github.com/keras-team/keras/issues/5818. If I use 0 or 2 things work fine, but all of that is irrelevant since the script never seems to start grabbing data or training. I appreciate the feedback.
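
When a script never seems to start grabbing data, one way to narrow things down is to pull a few batches from the generator by hand before handing it to fit_generator. This is a minimal sketch with hypothetical names (`smoke_test_generator`, `toy_generator`). It assumes the generator yields (samples, labels) pairs that support len(), as batch_generator in the question does, and the toy generator merely stands in for the real train_gen.

```python
def smoke_test_generator(gen, n_batches=2):
    """Pull a few batches and return (num_samples, num_labels) for each.

    Hypothetical helper: it only checks that batches arrive, are non-empty,
    and have matching sample and label counts.
    """
    sizes = []
    for i in range(n_batches):
        try:
            samples, labels = next(gen)
        except StopIteration:
            raise RuntimeError(f"generator was exhausted after {i} batches")
        if len(samples) == 0:
            raise ValueError(f"batch {i} is empty")
        if len(samples) != len(labels):
            raise ValueError(
                f"batch {i}: {len(samples)} samples but {len(labels)} labels"
            )
        sizes.append((len(samples), len(labels)))
    return sizes


def toy_generator():
    """Stand-in for the real generator: yields the same tiny batch forever."""
    while True:
        yield [[0.0, 1.0], [1.0, 0.0]], [0, 1]


print(smoke_test_generator(toy_generator(), n_batches=3))
```

If this check passes and training still does nothing, the generator itself is probably fine and the arguments to fit_generator (steps_per_epoch, validation_steps) are the next thing to print.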
