
I'm trying to learn image classification and have some models working on datasets that I generated. I've used ResNet50, EfficientNetB3 and EfficientNetB6 for transfer learning, and I end up with a working model each time, but I'd like to improve accuracy by fine-tuning.

This is where I run into memory problems. Looking at other people with similar issues, I don't think I should be hitting them: I'm using Colab Pro, and I even briefly paid for a 40 GB GPU, yet I still get OOM when running ResNet50 or EfficientNetB3.
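In case it matters, this is roughly how I check which GPU Colab has actually allocated (standard Colab/TensorFlow calls):

!nvidia-smi

import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))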

I built my model like this:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

#Set image dataset location, image size and batch size
data_root = "/content/FOD/DATA/Train/"
TRAINING_DATA_DIR = str(data_root)
image_size = (400, 400)
batch_size = 16

#Split the dataset into train and validation sets
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    TRAINING_DATA_DIR,
    validation_split=0.2,
    subset="training",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    TRAINING_DATA_DIR,
    validation_split=0.2,
    subset="validation",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
)

base_model = keras.applications.ResNet50(
    weights = 'imagenet',  # Load weights pre-trained on ImageNet.
    input_shape = (400, 400, 3),
    pooling = 'max',
    include_top = False)  # Do not include the ImageNet classifier at the top.

base_model.trainable = False

inputs = keras.Input(shape=(400, 400, 3))
# We make sure that the base_model is running in inference mode here
# by passing `training=False`. This is important for fine-tuning later.
x = base_model(inputs, training=False)
# Convert features of shape `base_model.output_shape[1:]` to vectors
# x = keras.layers.GlobalAveragePooling2D()(x)
x = keras.layers.Flatten()(x)
x = keras.layers.Dropout(0.2)(x)
# A Dense classifier with a single unit (binary classification)
outputs = keras.layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)

model.compile(optimizer=keras.optimizers.Adam(),
              loss = keras.losses.binary_crossentropy,
              metrics=[keras.metrics.BinaryAccuracy()])

keras_callbacks   = [
      EarlyStopping(monitor='val_loss', patience=2, mode='min', min_delta=0.0001),
      ModelCheckpoint(filepath="/content/gdrive/MyDrive/MODELS/ResNet50-4Items-CMH/", save_weights_only=False, monitor='val_loss', save_best_only=True, mode='min')
]

hist = model.fit(train_ds, epochs=30, callbacks=keras_callbacks, validation_data=val_ds)

This all seems to run OK and gives me a model at the end. My code to try to unfreeze layers and fine-tune is:

###############################Fine-Tuning##################################################
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Load the model you wish to fine-tune
tf.keras.backend.clear_session()
model = tf.keras.models.load_model('*link to drive location of my model*')
##View the model layers
model.summary()

#Set the model to save the best epoch with the highest validation accuracy
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath="*link to save location of my model*",
    save_weights_only=False,
    monitor='val_binary_accuracy',
    mode='max',
    save_best_only=True)
#Train the model with only 2 epochs and a small batch size to decrease memory usage
#model.fit(train_ds, epochs=2, batch_size = 8, callbacks = [model_checkpoint_callback], validation_data=val_ds)
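# (As far as I can tell from the docs, the batch_size argument to fit() is not
# used when train_ds is already a batched tf.data.Dataset; the effective batch
# size is the one set in image_dataset_from_directory above.)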

def unfreeze_model(model):
    # We unfreeze the top 10 layers while leaving BatchNorm layers frozen
    for layer in model.layers[-10:]:
        if not isinstance(layer, layers.BatchNormalization):
            layer.trainable = True

unfreeze_model(model)
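# Note: model.layers here only contains the five wrapper layers
# (input_2, resnet50, flatten, dropout, dense), so I suspect this loop ends up
# marking the entire nested resnet50 layer as trainable.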

##Set the learning rate to be low so as to only make small changes to an already good model
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-5)
model.compile(optimizer=optimizer, loss=keras.losses.binary_crossentropy, metrics=[keras.metrics.BinaryAccuracy()])
model.summary()

hist = model.fit(train_ds, epochs=10, batch_size=2, callbacks = [model_checkpoint_callback], validation_data=val_ds)

I've tried reducing the batch size, etc., and have also moved to what the internet calls smaller models, but I always run into OOM when fine-tuning.
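The only other memory-related setting I know of is mixed precision. As far as I understand it would look roughly like the sketch below (run before building or loading the model, TF 2.4+), but I haven't confirmed whether it actually avoids the OOM:

import tensorflow as tf

# Sketch: compute layer activations in float16 while keeping variables in float32;
# loss scaling is handled automatically when training with Model.fit.
tf.keras.mixed_precision.set_global_policy('mixed_float16')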

The output of model.summary() before unfreezing (first) and after unfreezing (second) is:

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   

=================================================================

 input_2 (InputLayer)        [(None, 400, 400, 3)]     0         
                                                                 
 resnet50 (Functional)       (None, 2048)              23587712  
                                                                 
 flatten (Flatten)           (None, 2048)              0         
                                                                 
 dropout (Dropout)           (None, 2048)              0         
                                                                 
 dense (Dense)               (None, 1)                 2049      
                                                                 
=================================================================
Total params: 23,589,761
Trainable params: 2,049
Non-trainable params: 23,587,712
_________________________________________________________________
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   

=================================================================

 input_2 (InputLayer)        [(None, 400, 400, 3)]     0         
                                                                 
 resnet50 (Functional)       (None, 2048)              23587712  
                                                                 
 flatten (Flatten)           (None, 2048)              0         
                                                                 
 dropout (Dropout)           (None, 2048)              0         
                                                                 
 dense (Dense)               (None, 1)                 2049      
                                                                 
=================================================================
Total params: 23,589,761
Trainable params: 23,536,641
Non-trainable params: 53,120

I have tried to reduce my trainable params, but it seems to be either 23 million or 2 thousand, with nothing in between, when following tutorials online.
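For reference, this is the kind of partial unfreeze I'm aiming for, which I think should give a trainable-parameter count somewhere between those two extremes. It's only a sketch adapted to my wrapper model, and it assumes the nested base is the layer named 'resnet50' shown in the summaries above:

# Sketch: unfreeze only the top 10 layers *inside* the nested ResNet50 base,
# keeping BatchNormalization layers frozen, then recompile so the change takes effect.
# Assumes `model` is the loaded wrapper model from the fine-tuning script above.
base = model.get_layer('resnet50')
base.trainable = True                 # makes every layer in the base trainable...
for layer in base.layers[:-10]:       # ...so re-freeze all but the top 10
    layer.trainable = False
for layer in base.layers[-10:]:
    if isinstance(layer, tf.keras.layers.BatchNormalization):
        layer.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss=keras.losses.binary_crossentropy,
              metrics=[keras.metrics.BinaryAccuracy()])
model.summary()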
