
I created a DataGenerator based on Keras's Sequence class.

import tensorflow.keras as keras
from skimage.io import imread
from skimage.transform import resize
import numpy as np
import math
from tensorflow.keras.utils import Sequence

Here, `x_set` is a list of paths to the images and `y_set` holds the associated classes.

class DataGenerator(Sequence):
    def __init__(self, x_set, y_set, batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        batch_x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]

        return np.array([
            resize(imread(file_name), (224, 224))
            for file_name in batch_x]), np.array(batch_y)
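Before wiring the generator into training, the batch slicing in `__len__`/`__getitem__` can be sanity-checked with a stripped-down stand-in that returns the path batches instead of decoded images (`PathBatcher` is a hypothetical name, not part of the code above; it just reuses the same slicing logic without touching the filesystem):

```python
import math

class PathBatcher:
    """Stand-in for DataGenerator that skips image decoding,
    so the batching arithmetic can be checked in isolation."""

    def __init__(self, x_set, y_set, batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches, rounding up so a partial last batch counts.
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        lo, hi = idx * self.batch_size, (idx + 1) * self.batch_size
        return self.x[lo:hi], self.y[lo:hi]

paths = [f"img_{i}.png" for i in range(10)]
labels = list(range(10))
gen = PathBatcher(paths, labels, batch_size=4)
# len(gen) is 3: two full batches of 4 and one final batch of 2.
```

If the last batch comes back shorter than `batch_size`, that is expected and `fit` handles it.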

Then, I applied this to my training and validation data. X_train is a list of strings containing the image paths to the training data, and y_train holds the one-hot-encoded labels of the training data. The validation data are set up the same way.

I created the image paths using this code:

X_train = []
for name in train_FileName:
  file_path = r"/content/gdrive/My Drive/data/2017-IWT4S-CarsReId_LP-dataset/" + name
  X_train.append(file_path)

After that, I applied the DataGenerator to the training and validation data:

training_generator = DataGenerator(X_train, y_train, batch_size=32)
validation_generator = DataGenerator(X_val, y_val, batch_size=32)

Afterwards, I used the fit_generator method to train the model:

model.fit_generator(generator=training_generator,
                    validation_data=validation_generator,
                    steps_per_epoch=num_train_samples // 32,
                    validation_steps=num_val_samples // 32,
                    epochs=10,
                    use_multiprocessing=True,
                    workers=2)

On CPU this worked fine the first few times: my model was initialized and the first epoch started. Then I changed the runtime type in Google Colab to GPU and ran the model again.

And got the following error:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-79-f43ade94ee10> in <module>()
      5                     epochs = 10,
      6                     use_multiprocessing=True,
----> 7                     workers=2)

16 frames
/usr/local/lib/python3.6/dist-packages/imageio/core/request.py in _parse_uri(self, uri)
    271                 # Reading: check that the file exists (but is allowed a dir)
    272                 if not os.path.exists(fn):
--> 273                     raise FileNotFoundError("No such file: '%s'" % fn)
    274             else:
    275                 # Writing: check that the directory to write to does exist

FileNotFoundError: No such file: '/content/gdrive/My Drive/data/2017-IWT4S-CarsReId_LP-dataset/s01_l01/1_1.png'

Today, I also got this error when running the program without the GPU. While it was running, Colab told me there was a Google Drive timeout. So is this error due to that Drive timeout? If so, how can I solve it? Does anyone know what I should change in the program?

Tobitor
  • Do you mount your Google Drive using `from google.colab import drive`, `drive.mount('/content/gdrive')`? – Mikolasan Jun 25 '20 at 04:17
  • Yes, I did. Without mounting, I think the process would not work on CPU either, but it does work on CPU in Colab, just not on GPU. That is the weird part about this issue... – Tobitor Jun 25 '20 at 12:32
  • Just checking that you have it in your notebook. This code is my first block, so every time I run all blocks in my notebook (Ctrl+F9) I need to go through the process of authorization, which is weird. – Mikolasan Jun 25 '20 at 15:44
  • Yes, you are right. That is a little bit annoying! Thanks for the hint regarding CTRL + F9 ;-) – Tobitor Jun 25 '20 at 17:10

2 Answers


You can run this code in the browser console to avoid the timeout in Google Colab:

function ConnectButton() {
    console.log("Connect pushed");
    document.querySelector("#top-toolbar > colab-connect-button")
        .shadowRoot.querySelector("#connect").click();
}
setInterval(ConnectButton, 60000);

Source: How to prevent Google Colab from disconnecting?

Usman Ali

The problem seems to be the input: your model cannot find the input file. If you change the runtime type, Colab performs a factory reset and all disk content in the session is erased.

Run your cells from the beginning whenever you change the runtime in between.

Vijeth Rai
  • Yes, I did this. I changed the runtime type and ran my code from the very beginning. But unfortunately the error remains... And it only happens when I run the code on GPU; on the normal CPU there is no problem... – Tobitor Jun 19 '20 at 18:22
  • Try using the WGET extension for loading the dataset. I use this and never had any problem. Install the WGET extension, click download on the file in Drive and cancel, copy everything written in the WGET extension, then !paste it in a Colab cell (the ! is important). Now the file is on your Colab disk. Then in your code: file_path = r"2017-IWT4S-CarsReId_LP-dataset/" + name, if you downloaded the file 2017-IWT4S-CarsReId_LP-dataset – Vijeth Rai Jun 19 '20 at 18:31
  • Ok, can we maybe chat somehow regarding this? I do not know what you mean by "Copy everything written in WGET extension" – Tobitor Jun 19 '20 at 18:40
  • When you click on the WGET extension in Chrome, after cancelling the download (no need to download the file, only the download initiation is necessary) you will find a lot of weird stuff written in it. The last words will be your download file name. Copy all of that. – Vijeth Rai Jun 19 '20 at 18:53
  • Do I have to click on "Download all Links"? – Tobitor Jun 19 '20 at 19:01