
I'm trying to train a model. I have almost 150 classes and I'm using ImageDataGenerator to augment my dataset. I'm also using ModelCheckpoint and CSVLogger callbacks to save the weights and the training log. When I start training, it gives me an error at a certain point in the first epoch. The images I'm using are grayscale, if that helps.

here is my code:

batch_size = 2000
epochs = 10

# Augments dataset 10x
train_batches = ImageDataGenerator(preprocessing_function=preprocess_func, horizontal_flip=True, width_shift_range=0.1, height_shift_range=0.1, shear_range=0.2, zoom_range=0.2, fill_mode='nearest') \
    .flow_from_directory(directory=train_path, target_size=image_size, classes=dataset_classes, batch_size=5, color_mode='grayscale')
valid_batches = ImageDataGenerator(preprocessing_function=preprocess_func, horizontal_flip=True, width_shift_range=0.15, height_shift_range=0.1, shear_range=0.2, zoom_range=0.2, fill_mode='nearest') \
    .flow_from_directory(directory=valid_path, target_size=image_size, classes=dataset_classes, batch_size=5, color_mode='grayscale')
test_batches = ImageDataGenerator(preprocessing_function=preprocess_func, horizontal_flip=True, width_shift_range=0.15, height_shift_range=0.1, shear_range=0.2, zoom_range=0.2, fill_mode='nearest') \
    .flow_from_directory(directory=test_path, target_size=image_size, classes=dataset_classes, batch_size=5, color_mode='grayscale')

here is my callback:

from keras.callbacks import ModelCheckpoint, CSVLogger

checkpoint_path = "/content/drive/MyDrive/Colab Notebooks/Datasets/Experiment/weights_improvements-epoch:{epoch:02d}-val_accuracy:{val_accuracy:.2f}.hdf5"
checkpoint_dir = os.path.dirname(checkpoint_path)

# Create a callback that saves the model's weights
cp_callback = ModelCheckpoint(checkpoint_path,
                              verbose=1,
                              monitor='val_accuracy',
                              mode='max',
                              save_best_only=True,
                              save_weights_only=True)

log_folder = '/content/drive/MyDrive/Colab Notebooks/Datasets/Experiment'
log_path = os.path.join(log_folder, 'FSLR_logs.csv')
log_csv = CSVLogger(log_path, separator=',', append=False)

callback_list = [cp_callback, log_csv]

Fitting the model:

# Compile the layers into one model and create a connection
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(), metrics=['accuracy'])

# Train the model with the new callback
history = model.fit(x=train_batches,
                    validation_data=valid_batches,
                    batch_size=batch_size,
                    epochs=epochs,
                    callbacks=callback_list)

The error I'm receiving is this:

Epoch 1/10
3428/4128 [=======================>......] - ETA: 26:10 - loss: 4.8299 - accuracy: 0.0078
---------------------------------------------------------------------------
UnknownError                              Traceback (most recent call last)
in ()
      4                     batch_size=batch_size,
      5                     epochs=epochs,
----> 6                     callbacks=callback_list)

6 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

UnknownError: OSError: image file is truncated (30 bytes not processed)
Traceback (most recent call last):

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/script_ops.py", line 249, in __call__
    ret = func(*args)

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/autograph/impl/api.py", line 645, in wrapper
    return func(*args, **kwargs)

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 892, in generator_py_func
    values = next(generator_state.get_iterator(iterator_id))

  File "/usr/local/lib/python3.7/dist-packages/keras/engine/data_adapter.py", line 822, in wrapped_generator
    for data in generator_fn():

  File "/usr/local/lib/python3.7/dist-packages/keras/engine/data_adapter.py", line 948, in generator_fn
    yield x[i]

  File "/usr/local/lib/python3.7/dist-packages/keras_preprocessing/image/iterator.py", line 65, in __getitem__
    return self._get_batches_of_transformed_samples(index_array)

  File "/usr/local/lib/python3.7/dist-packages/keras_preprocessing/image/iterator.py", line 230, in _get_batches_of_transformed_samples
    interpolation=self.interpolation)

  File "/usr/local/lib/python3.7/dist-packages/keras_preprocessing/image/utils.py", line 138, in load_img
    img = img.resize(width_height_tuple, resample)

  File "/usr/local/lib/python3.7/dist-packages/PIL/Image.py", line 1886, in resize
    self.load()

  File "/usr/local/lib/python3.7/dist-packages/PIL/ImageFile.py", line 247, in load
    "(%d bytes not processed)" % len(b)

OSError: image file is truncated (30 bytes not processed)

[[{{node PyFunc}}]] [[IteratorGetNext]] [Op:__inference_train_function_1029]

Function call stack: train_function

I've tried using the same code to train just two classes and it works fine. I don't know why it isn't working when I use it on all of my 140+ classes.

Can someone explain the problem to me? I kinda need this for my school project. Thank you in advance!

Edit: I've run this code to verify all the images. It didn't find any corrupted files.

import os
from os import listdir
from PIL import Image

categ = ['Train', 'Valid', 'Test']
dataset = '/content/drive/MyDrive/Colab Notebooks/Datasets/FSLR_Application_Dataset'

for cat in categ:
  img_path = os.path.join(dataset, cat)
  for foldername in listdir(img_path):
    sign_path = os.path.join(img_path, foldername)
    print(sign_path)
    for sign in listdir(sign_path):
      if sign.endswith('.jpg'):
        try:
          img = Image.open(os.path.join(sign_path, sign)) # open the image file
          img.verify() # verify that it is, in fact an image
        except (IOError, SyntaxError) as e:
          print('Bad file:', sign) # print out the names of corrupt files
  • It is likely that one of your image files is corrupted and that triggers the error. – Dr. Snoopy Nov 01 '21 at 02:36
  • I've run a code to verify the images using PIL.Image.verify; there seem to be no problems with the datasets. – Lord Dickenstein Nov 01 '21 at 05:09
  • The exception thrown is OSError; you are not catching that exception in your verification code, which is why you do not find the corrupted image(s). – Dr. Snoopy Nov 01 '21 at 11:04
  • The program also didn't find an error in the images even after adding the OSError exception. Although I just realized that PIL is used by Keras under the hood, and judging from the answers to related questions here on Stack Overflow, I have to run `ImageFile.LOAD_TRUNCATED_IMAGES = True` even when I am not using the module directly. I just thought that the people who were having the same problem as me were using PIL directly in their code. – Lord Dickenstein Nov 01 '21 at 13:08

2 Answers


I have had similar problems with finding defective image files. The ImageDataGenerator uses PIL. The generator did not detect an error in the image file; if it had, it would have printed a warning message. So I suggest you try using something other than PIL to detect defective image files. Try using cv2; I have found it sometimes detects errors that PIL does not. Specifically:

import cv2

In your code, replace

          img = Image.open(os.path.join(sign_path, sign)) # open the image file
          img.verify() # verify that it is, in fact an image
        except (IOError, SyntaxError) as e:
          print('Bad file:', sign) # print out the names of corrupt files

with

bad_file_list = []
bad_count = 0

try:
    img = cv2.imread(os.path.join(sign_path, sign))  # cv2.imread returns None for unreadable files
    shape = img.shape  # this will throw an error if the img is not read correctly
except Exception:
    bad_file_list.append(os.path.join(sign_path, sign))
    bad_count += 1

Then, outside the loop, print out whether any bad files were found.
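For reference, here is a minimal, self-contained sketch of that check, reusing the categ/dataset directory layout and the .jpg extension from the question's own verification code (those are assumptions carried over from the OP's setup, not something this answer prescribes):

import os
from os import listdir
import cv2

categ = ['Train', 'Valid', 'Test']
dataset = '/content/drive/MyDrive/Colab Notebooks/Datasets/FSLR_Application_Dataset'

bad_file_list = []

for cat in categ:
  img_path = os.path.join(dataset, cat)
  for foldername in listdir(img_path):
    sign_path = os.path.join(img_path, foldername)
    for sign in listdir(sign_path):
      if sign.endswith('.jpg'):
        full_path = os.path.join(sign_path, sign)
        try:
          img = cv2.imread(full_path)   # returns None if the file cannot be decoded
          shape = img.shape             # raises AttributeError when img is None
        except Exception:
          bad_file_list.append(full_path)  # collect unreadable/defective files

print('Defective files found:', len(bad_file_list))
for f in bad_file_list:
  print(f)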

Gerry P

I faced the same problem before and this worked for me; add these lines before fitting the model:

from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

history = model.fit(...) # your fitting code
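As background (not part of the original answer): `ImageFile.LOAD_TRUNCATED_IMAGES = True` tells Pillow, which Keras uses under the hood to load images, to tolerate files whose image data ends early. Instead of raising `OSError: image file is truncated`, it loads whatever data is present and leaves the missing part blank, so training can continue past the defective file. The truncated image is still fed to the generator as a (partially blank) sample, so it may also be worth locating and replacing the broken file.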