I have the following directory structure and want to read the JPG files under test.

./cats_dogs_small
├── test
│   ├── cats  <- 1000 images
│   └── dogs  <- 1000 images

To read the files, I use the following MWE:

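import os

import numpy as np
from keras.preprocessing.image import ImageDataGenerator

train_dir = os.path.join(os.environ['HOME'], 'Documents/cats_dogs_small')
train_dir = os.path.join(train_dir, 'train')

datagen = ImageDataGenerator(rescale=1./255)
batch_size = 20

def extract_features(directory):
    generator = datagen.flow_from_directory(directory,
                                            target_size=(150, 150),
                                            batch_size=batch_size,
                                            class_mode='binary')
    features, labels = [], []
    i = 0
    for inputs_batch, labels_batch in generator:
        print(i, end=' ')
        features.append(inputs_batch)
        labels.append(labels_batch)
        i += 1
        # The generator yields batches indefinitely; stop after one full pass.
        if i >= len(generator):
            break
    return np.concatenate(features), np.concatenate(labels)

train_features, train_labels = extract_features(train_dir)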

Every time I run it, I get the same error message:

2020-11-19 16:08:56.973416: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-11-19 16:08:56.973436: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Found 2000 images belonging to 2 classes.
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Traceback (most recent call last):

  File "/~/Documents/keras/untitled0.py", line 30, in <module>
    train_features, train_labels = extract_features(train_dir)

  File "/~/Documents/keras/untitled0.py", line 25, in extract_features
    for inputs_batch, labels_batch in generator:

  File "/.local/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py", line 104, in __next__
    return self.next(*args, **kwargs)

  File "/.local/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py", line 116, in next
    return self._get_batches_of_transformed_samples(index_array)

  File "/.local/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py", line 227, in _get_batches_of_transformed_samples
    img = load_img(filepaths[j],

  File "/.local/lib/python3.8/site-packages/keras_preprocessing/image/utils.py", line 114, in load_img
    img = pil_image.open(io.BytesIO(f.read()))

  File "/anaconda3/envs/keras28/lib/python3.8/site-packages/PIL/Image.py", line 2943, in open
    raise UnidentifiedImageError(

UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f41286f3090>

The error is raised at a different point on each run. In the output above the code crashed after batch 60, but it sometimes crashes after 43, 69, or some other number. The problem therefore seems to be related not to a specific image but to the way I'm using flow_from_directory / ImageDataGenerator.
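
A note on the varying crash position: flow_from_directory shuffles the file order by default (shuffle=True), so a single bad file would end up in a different batch on every run. Below is a minimal sketch for narrowing down the failing files, assuming the generator is created with shuffle=False so that batches follow the order of generator.filenames. The helper name files_in_batch is hypothetical and not part of Keras.

```python
def files_in_batch(filenames, batch_size, batch_index):
    """Return the files that make up one batch when shuffling is disabled.

    Assumes batches are consecutive slices of `filenames`, which is how
    flow_from_directory(..., shuffle=False) walks through the data.
    """
    start = batch_index * batch_size
    return filenames[start:start + batch_size]


# Hypothetical usage with the generator from the question:
#   generator = datagen.flow_from_directory(directory, shuffle=False, ...)
#   suspects = files_in_batch(generator.filenames, batch_size, last_printed + 1)
example = ['cats/cat.%d.jpg' % n for n in range(100)]
print(files_in_batch(example, 20, 2)[:2])
```

Because the loop prints i only after a batch has been loaded, the batch that fails is the one after the last index printed.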

Keras version: 2.4.3

user3889486
  • Check whether the path exists and whether PIL is able to open JPEG files: https://stackoverflow.com/questions/19230991/image-open-cannot-identify-image-file-python – Prabindh Nov 20 '20 at 03:53
  • @Prabindh Thanks! I ran the code you referenced and found that the issue was a corrupted image. I re-ran the code above with the fixed image and it works as expected. – user3889486 Nov 20 '20 at 16:52
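
For anyone hitting the same error, the check described in the comments can be sketched roughly as follows. This is not the exact code from the linked question, just one way to scan a directory tree with PIL and list the files it cannot read. The function name find_unreadable_images is made up for illustration.

```python
import os

from PIL import Image


def find_unreadable_images(root):
    """Walk `root` and return the paths of files that PIL cannot open or verify."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with Image.open(path) as img:
                    # verify() checks file integrity without decoding all pixels.
                    img.verify()
            except (OSError, SyntaxError):
                # UnidentifiedImageError is a subclass of OSError.
                bad.append(path)
    return sorted(bad)


# Hypothetical usage with the directory from the question:
#   root = os.path.join(os.environ['HOME'], 'Documents/cats_dogs_small')
#   for path in find_unreadable_images(root):
#       print(path)
```

Note that verify() may miss some kinds of damage, such as a truncated file with a valid header, so a file that passes this check can still fail when fully decoded.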
