I have a script that performs 5-fold cross-validation on an image set with the pretrained ResNet50 model using Ktrain, which is a wrapper around TensorFlow Keras. A model is trained for 30 epochs on each fold, and the whole CV procedure is then repeated 5 additional times.
The first fold of the first run trains fast enough for my purposes, about 6 minutes per epoch:
Epoch 1/10
288/288 [==============================] - 370s 1s/step - loss: 13.0123 - mse: 13.0123 - val_loss: 2.8116 - val_mse: 2.8116
Epoch 2/10
288/288 [==============================] - 367s 1s/step - loss: 6.2146 - mse: 6.2146 - val_loss: 2.2179 - val_mse: 2.2179
However, the per-epoch training time for every subsequent model is substantially higher, well over 30 minutes per epoch:
begin training using onecycle policy with max lr of 0.0001...
Epoch 1/10
286/286 [==============================] - 2229s 8s/step - loss: 8.5098 - mse: 8.5098 - val_loss: 2.3128 - val_mse: 2.3128
Epoch 2/10
286/286 [==============================] - 2213s 8s/step - loss: 5.2229 - mse: 5.2229 - val_loss: 2.4311 - val_mse: 2.4311
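For reference, the per-epoch times above come straight from the Keras progress bar. The raw wall-clock time can also be checked with a small callback like the sketch below (my own helper, not part of Ktrain), passed through the callbacks argument of fit_onecycle:

import time
import tensorflow as tf

class EpochTimer(tf.keras.callbacks.Callback):
    """Print the wall-clock duration of every epoch."""
    def on_epoch_begin(self, epoch, logs=None):
        self._t0 = time.time()

    def on_epoch_end(self, epoch, logs=None):
        print(f"epoch {epoch + 1}: {time.time() - self._t0:.1f}s")

# usage (hypothetical): learner.fit_onecycle(1e-4, epochs, callbacks=[EpochTimer()])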
At the end of each fold I use the Ktrain function release_gpu_memory, defined as:
def release_gpu_memory(device=0):
    """
    Release GPU memory allocated by TensorFlow.
    Source:
    https://stackoverflow.com/questions/51005147/keras-release-memory-after-finish-training-process
    """
    from numba import cuda
    K.clear_session()
    cuda.select_device(device)
    cuda.close()
    return
A common solution I have seen is to call the Keras clear_session() function, which this function already includes. However, it does not appear to help. What can I do to keep the training time consistent across each iteration? Below is my script:
import os
import ktrain
from ktrain import vision as vis
from ktrain.vision.data import images_from_df
from ktrain.core import release_gpu_memory
import pandas as pd
from pprint import pprint
import glob2
from sklearn.model_selection import (GroupShuffleSplit, StratifiedGroupKFold)
from sklearn.model_selection import GroupKFold
import matplotlib.pyplot as plt
from IPython.display import display, HTML
import numpy as np
import tensorflow as tf
import multiprocessing
#############################
# Initial settings excluded #
#############################
# Initialize results df
results = labels.copy()
for r in range(runs):
    # Generate splits
    gkf = GroupKFold(n_splits=5)
    for i, (train_index, val_index) in enumerate(gkf.split(filtered, groups=filtered[group]), 1):  # start at fold 1, not 0
        # Create training and validation data sets and image generators
        train, val = filtered.iloc[train_index], filtered.iloc[val_index]
        (train_img, val_img, preproc) = images_from_df(train_df=train, image_column='id', label_columns='DIFF',
                                                       directory=f"images/{expt}", suffix='.tif',
                                                       val_df=val, is_regression=True, target_size=dim, color_mode='rgb')
        # Create model
        model = vis.image_regression_model(name='pretrained_resnet50', train_data=train_img, val_data=val_img,
                                           freeze_layers=None, metrics=['mse'])
        learner = ktrain.get_learner(model=model, train_data=train_img, val_data=val_img,
                                     workers=multiprocessing.cpu_count()-1, use_multiprocessing=False, batch_size=64)
        # Train model
        print(f'Run {r+1} Fold {i}')
        learner.fit_onecycle(1e-4, epochs)
        # Plot training and validation loss
        learner.plot()
        # Create Predictor instance
        predictor = ktrain.get_predictor(learner.model, preproc)

        def predict_diff(row):
            id = row['id']
            fname = f'images/{expt}/{id}.tif'
            pred = round(predictor.predict_filename(fname, return_proba=True)[0])
            return pred

        # Predict on the validation images and store results
        mask = results.id.isin(val.id)
        results.loc[mask, f'Run_{r+1}'] = results[mask].apply(lambda row: predict_diff(row), axis=1)
        # Free GPU memory before the next fold
        release_gpu_memory()

results.to_csv("21PLTR-NNN_PED.csv", index=False)
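To see whether GPU memory is actually being released between folds, I am considering printing the allocator stats right before the cleanup call, something like this sketch (not in the script above; tf.config.experimental.get_memory_info requires a reasonably recent TensorFlow 2.x, if I remember correctly):

# Hypothetical per-fold check: report how much GPU memory TensorFlow has
# allocated right before attempting to release it.
info = tf.config.experimental.get_memory_info('GPU:0')
print(f"GPU memory - current: {info['current'] / 1e6:.0f} MB, peak: {info['peak'] / 1e6:.0f} MB")
release_gpu_memory()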
I tried using the built-in release_gpu_memory function provided by Ktrain, which includes the Keras clear_session call. I expected the epochs in each successive fold to take about as long as those in the first fold, but instead the training time increases substantially.
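For reference, this is the bare-bones cleanup I would have expected release_gpu_memory to be roughly equivalent to (a sketch of my understanding, not Ktrain code): drop the references to the fold's objects, clear the Keras session, and force garbage collection.

import gc
import tensorflow as tf

def basic_cleanup():
    """Reset Keras/TensorFlow graph state and force Python garbage collection."""
    tf.keras.backend.clear_session()
    gc.collect()

# at the end of each fold, something like:
# del model, learner, predictor
# basic_cleanup()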