
I have been using Keras to train a CNN. Earlier today I was able to train a model by fine-tuning VGG16, and I decided to compare it to a fine-tuned VGG19 model.

Initially I was getting the error "Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above." After a Stack Overflow search I found that my CUDA version was not compatible with my TensorFlow version, so I looked up which versions work together and installed them.
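When comparing an environment against a compatibility table, it can help to print what is actually installed first. The following is a minimal sketch, not part of the original setup: the distribution names in `PACKAGES` are assumptions (adjust them to whatever pip or conda installed), and the CUDA toolkit itself is not a Python package, so its version has to be checked separately.

```python
from importlib import metadata  # standard library, Python 3.8+

# Assumed distribution names; edit this tuple to match your environment.
PACKAGES = ("tensorflow", "tensorflow-gpu", "keras", "Pillow")


def installed_versions(names=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:>16}: {version or 'not installed'}")
```

Running this inside the same conda environment as the notebook shows which versions the kernel really sees, which is not always what was most recently installed.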

I went to rerun my VGG16 model to make sure everything was working, but now I get:


```
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-13-4d283cf7c5f9> in <module>
      4     validation_data = validation_generator,
      5     validation_steps = 700,
----> 6     epochs = 5)

~\Anaconda3\lib\site-packages\keras\legacy\interfaces.py in wrapper(*args, **kwargs)
     89                 warnings.warn('Update your `' + object_name + '` call to the ' +
     90                               'Keras 2 API: ' + signature, stacklevel=2)
---> 91             return func(*args, **kwargs)
     92         wrapper._original_function = func
     93         return wrapper

~\Anaconda3\lib\site-packages\keras\engine\training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
   1416             use_multiprocessing=use_multiprocessing,
   1417             shuffle=shuffle,
-> 1418             initial_epoch=initial_epoch)
   1419 
   1420     @interfaces.legacy_generator_methods_support

~\Anaconda3\lib\site-packages\keras\engine\training_generator.py in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
    179             batch_index = 0
    180             while steps_done < steps_per_epoch:
--> 181                 generator_output = next(output_generator)
    182 
    183                 if not hasattr(generator_output, '__len__'):

~\Anaconda3\lib\site-packages\keras\utils\data_utils.py in get(self)
    707                     "`use_multiprocessing=False, workers > 1`."
    708                     "For more information see issue #1638.")
--> 709             six.reraise(*sys.exc_info())

~\Anaconda3\lib\site-packages\six.py in reraise(tp, value, tb)
    694             if value.__traceback__ is not tb:
    695                 raise value.with_traceback(tb)
--> 696             raise value
    697         finally:
    698             value = None

~\Anaconda3\lib\site-packages\keras\utils\data_utils.py in get(self)
    683         try:
    684             while self.is_running():
--> 685                 inputs = self.queue.get(block=True).get()
    686                 self.queue.task_done()
    687                 if inputs is not None:

~\Anaconda3\lib\multiprocessing\pool.py in get(self, timeout)
    681             return self._value
    682         else:
--> 683             raise self._value
    684 
    685     def _set(self, i, obj):

~\Anaconda3\lib\multiprocessing\pool.py in worker(inqueue, outqueue, initializer, initargs, maxtasks, wrap_exception)
    119         job, i, func, args, kwds = task
    120         try:
--> 121             result = (True, func(*args, **kwds))
    122         except Exception as e:
    123             if wrap_exception and func is not _helper_reraises_exception:

~\Anaconda3\lib\site-packages\keras\utils\data_utils.py in next_sample(uid)
    624         The next value of generator `uid`.
    625     """
--> 626     return six.next(_SHARED_SEQUENCES[uid])
    627 
    628 

~\Anaconda3\lib\site-packages\keras_preprocessing\image\iterator.py in __next__(self, *args, **kwargs)
    102 
    103     def __next__(self, *args, **kwargs):
--> 104         return self.next(*args, **kwargs)
    105 
    106     def next(self):

~\Anaconda3\lib\site-packages\keras_preprocessing\image\iterator.py in next(self)
    114         # The transformation of images is not under thread lock
    115         # so it can be done in parallel
--> 116         return self._get_batches_of_transformed_samples(index_array)
    117 
    118     def _get_batches_of_transformed_samples(self, index_array):

~\Anaconda3\lib\site-packages\keras_preprocessing\image\iterator.py in _get_batches_of_transformed_samples(self, index_array)
    228                            color_mode=self.color_mode,
    229                            target_size=self.target_size,
--> 230                            interpolation=self.interpolation)
    231             x = img_to_array(img, data_format=self.data_format)
    232             # Pillow images should be closed after `load_img`,

~\Anaconda3\lib\site-packages\keras_preprocessing\image\utils.py in load_img(path, grayscale, color_mode, target_size, interpolation)
    130                         ", ".join(_PIL_INTERPOLATION_METHODS.keys())))
    131             resample = _PIL_INTERPOLATION_METHODS[interpolation]
--> 132             img = img.resize(width_height_tuple, resample)
    133     return img
    134 

~\Anaconda3\lib\site-packages\PIL\Image.py in resize(self, size, resample, box)
   1780             return self.convert('RGBa').resize(size, resample, box).convert('RGBA')
   1781 
-> 1782         self.load()
   1783 
   1784         return self._new(self.im.resize(size, resample, box))

~\Anaconda3\lib\site-packages\PIL\TiffImagePlugin.py in load(self)
   1065     def load(self):
   1066         if self.use_load_libtiff:
-> 1067             return self._load_libtiff()
   1068         return super(TiffImageFile, self).load()
   1069 

~\Anaconda3\lib\site-packages\PIL\TiffImagePlugin.py in _load_libtiff(self)
   1157 
   1158         if err < 0:
-> 1159             raise IOError(err)
   1160 
   1161         return Image.Image.load(self)

OSError: -2
```


Any idea?

cdr
  • OSError -2 means "unknown error". It is raised while the generator is loading images, so it could be a file I/O problem; double-check that every image exists at the path where it is expected, although from the PIL code I can see that PIL handles that case specifically. So it is hard to say; maybe someone else has run into this. Which did you change: TensorFlow, CUDA, or both? You have aligned TensorFlow and CUDA, but the error comes from PIL, so it could also be a compatibility issue between Keras and PIL. – Geeocode Jan 03 '20 at 01:46
  • Both. Maybe I should try downgrading TensorFlow and CUDA? – cdr Jan 03 '20 at 01:53
  • I would rather try upgrading Keras first, if possible, since the actual problem seems to be between Keras and PIL (it may go deeper, but I am approaching it from the top of the stack first). If that doesn't work, then try downgrading TensorFlow and finding the compatible CUDA version. – Geeocode Jan 03 '20 at 02:11
  • You can see the code that triggers the error here: https://github.com/python-pillow/Pillow/blob/master/src/PIL/TiffImagePlugin.py#L1182. It's most likely a corrupt file, as err becomes negative only when the decoder cannot decode the file. It's quite common to have corrupt or incomplete image files that produce weird errors. – Dr. Snoopy Jan 03 '20 at 22:22
  • What I ended up doing was uninstalling and reinstalling conda. I felt I might have broken something while installing and uninstalling the associated packages to make it work. This time around I made sure that my versions of Python, Keras, TensorFlow and CUDA all worked together, and everything worked out! – cdr Jan 05 '20 at 01:16
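To test the corrupt-file theory from the comments, one option is to decode every image in the dataset once, outside of Keras, and list the files that fail. The sketch below is only an illustration under stated assumptions: the function names, the extension list and the `data/train` path in the usage note are hypothetical and not from the original post. It relies on Pillow's `Image.open` and `Image.load`, the latter being the same call that fails inside `resize` in the traceback.

```python
import os


def _pil_load(path):
    """Fully decode one image with Pillow; raises if the file cannot be decoded."""
    from PIL import Image  # imported lazily so the directory walk itself needs only the stdlib

    with Image.open(path) as img:
        img.load()  # forces the decoder to run, as img.resize() does in the traceback


def find_unreadable_images(
    root,
    extensions=(".tif", ".tiff", ".jpg", ".jpeg", ".png"),  # assumed list, adjust as needed
    load=_pil_load,
):
    """Walk `root` and return [(path, error message)] for image files that fail to decode."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.lower().endswith(extensions):
                continue  # skip non-image files
            path = os.path.join(dirpath, filename)
            try:
                load(path)
            except Exception as exc:  # broad on purpose: report every failing file
                bad.append((path, f"{type(exc).__name__}: {exc}"))
    return bad
```

For example, `for path, err in find_unreadable_images("data/train"): print(path, err)` would print each file that cannot be decoded. If the list is empty with the same Pillow installation the notebook uses, a broken environment becomes the more likely explanation, which would be consistent with the reinstall fixing the problem.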
