8

I tried to load training data with PyTorch's torchvision.datasets.ImageFolder in Colab.

import torch
from torchvision import datasets, transforms

transform = transforms.Compose([transforms.Resize(400),
                                transforms.ToTensor()])
dataset_path = 'ss/'
dataset = datasets.ImageFolder(root=dataset_path, transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=20)

I encountered the following error:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-27-7abcc1f434b1> in <module>()
      2                                 transforms.ToTensor()])
      3 dataset_path = 'ss/'
----> 4 dataset = datasets.ImageFolder(root=dataset_path, transform=transform)
      5 dataloader = torch.utils.data.DataLoader(dataset, batch_size=20)

3 frames
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/folder.py in make_dataset(directory, class_to_idx, extensions, is_valid_file)
    100         if extensions is not None:
    101             msg += f"Supported extensions are: {', '.join(extensions)}"
--> 102         raise FileNotFoundError(msg)
    103 
    104     return instances

FileNotFoundError: Found no valid file for the classes .ipynb_checkpoints. Supported extensions are: .jpg, .jpeg, .png, .ppm, .bmp, .pgm, .tif, .tiff, .webp

My dataset folder contains a subfolder with many training images in PNG format, but ImageFolder still can't access them.

4 Answers

13

I encountered the same problem when I was using IPython notebook-like tools.

First, check whether there are any hidden files under your dataset_path. Use ls -a if you are in a Linux environment.

In my case, I found a hidden directory called .ipynb_checkpoints sitting alongside the image class subfolders. ImageFolder treats every subdirectory of the root as a class, so it counted .ipynb_checkpoints as one and failed because that directory contains no images. I confirmed that I did not need it, deleted it, and the dataset then loaded fine.
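If you would rather check from Python than from the shell, the following is a minimal sketch of the same inspection. The helper name hidden_entries is made up for illustration, and the demo builds a throwaway directory instead of touching a real dataset.

```python
import os
import tempfile


def hidden_entries(root):
    """Return the sorted names of entries directly under `root` that start with a dot."""
    return sorted(name for name in os.listdir(root) if name.startswith("."))


# Demo on a temporary directory that mimics a dataset root (illustrative names only).
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "cats"))
    os.makedirs(os.path.join(demo_root, ".ipynb_checkpoints"))
    print(hidden_entries(demo_root))
```

Anything this reports next to your class subfolders is a candidate for the error in the question.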

Or, if you would rather ignore that directory than delete it, you can make the dataset skip hidden folders when it discovers its classes.
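One possible way to ignore hidden folders is sketched below. This is not necessarily what the answer originally pointed to. The function find_visible_classes is a hypothetical helper that mirrors the (classes, class_to_idx) pair ImageFolder builds, but skips dot-directories. The commented-out subclass assumes a torchvision version (0.10 or later) in which ImageFolder exposes an overridable find_classes method.

```python
import os


def find_visible_classes(directory):
    """Build (classes, class_to_idx) from subfolders of `directory`, skipping dot-directories."""
    classes = sorted(
        name
        for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name)) and not name.startswith(".")
    )
    if not classes:
        raise FileNotFoundError(f"Couldn't find any class folder in {directory}.")
    class_to_idx = {name: index for index, name in enumerate(classes)}
    return classes, class_to_idx


# Hypothetical wiring, assuming torchvision >= 0.10:
#
# from torchvision import datasets
#
# class VisibleImageFolder(datasets.ImageFolder):
#     def find_classes(self, directory):
#         return find_visible_classes(directory)
#
# dataset = VisibleImageFolder(root='ss/', transform=transform)
```

The advantage over deleting the folder is that the dataset keeps working if the notebook environment recreates .ipynb_checkpoints later.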

Y. Zhang
1

The files in the image folder need to be placed in the subfolders for each class, like this:

root/dog/xxx.png
root/dog/xxy.png
root/dog/[...]/xxz.png

root/cat/123.png
root/cat/nsdf3.png
root/cat/[...]/asd932_.png

https://pytorch.org/vision/stable/datasets.html#torchvision.datasets.ImageFolder

Are the files in your ss directory organized this way?
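To check the layout quickly, a rough sketch like the one below counts image files per top-level subfolder. The helper name images_per_class is made up for illustration, and the extension list is copied from the error message in the question. Any subfolder that reports 0 (including a hidden one) is a likely cause of the "Found no valid file" error.

```python
import os

# Extensions taken from the error message in the question.
IMG_EXTENSIONS = (".jpg", ".jpeg", ".png", ".ppm", ".bmp",
                  ".pgm", ".tif", ".tiff", ".webp")


def images_per_class(root):
    """Map each subfolder of `root` to the number of image files found beneath it."""
    counts = {}
    for name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, name)
        if not os.path.isdir(class_dir):
            continue  # loose files directly under root are not classes
        total = 0
        for _, _, files in os.walk(class_dir):  # nested folders count too
            total += sum(f.lower().endswith(IMG_EXTENSIONS) for f in files)
        counts[name] = total
    return counts


# Example: print(images_per_class('ss/'))
```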

Sergii Dymchenko
0

1- The files in the image folder need to be placed in subfolders, one per class (as Sergii Dymchenko said).

2- Use an absolute path when working in Google Colab.
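For the second point, a small sketch of resolving the path before handing it to ImageFolder. The helper name resolve_dataset_root is hypothetical. It reports the current working directory on failure, which helps when a relative path like 'ss/' is being resolved from an unexpected location.

```python
import os


def resolve_dataset_root(path):
    """Return `path` as an absolute path, failing early if it is not a directory."""
    root = os.path.abspath(os.path.expanduser(path))
    if not os.path.isdir(root):
        raise NotADirectoryError(
            f"{root} is not an existing directory (cwd: {os.getcwd()})"
        )
    return root


# Example: dataset_path = resolve_dataset_root('ss/')
```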

0

The solution for Google Colaboratory:
When you create a directory, Colaboratory additionally creates .ipynb_checkpoints in it.
To solve the problem, it is enough to remove it from the folder containing the directories with images (i.e. from the train folder). Run:

!rm -R test/train/.ipynb_checkpoints
!ls test/train/ -a   #to make sure that the deletion has occurred

where test/train/ is the path to my dataset folders.
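The same cleanup can be done from Python, which is convenient when checkpoint folders may also sit inside nested directories. This is only a sketch: remove_checkpoint_dirs is a hypothetical helper, and it permanently deletes every matching folder under the given root, so point it at the dataset directory only.

```python
import os
import shutil


def remove_checkpoint_dirs(root, name=".ipynb_checkpoints"):
    """Delete every directory called `name` under `root`; return the removed paths."""
    removed = []
    for current, dirnames, _ in os.walk(root):
        if name in dirnames:
            target = os.path.join(current, name)
            shutil.rmtree(target)
            dirnames.remove(name)  # keep os.walk from descending into the deleted folder
            removed.append(target)
    return sorted(removed)


# Example: print(remove_checkpoint_dirs('test/train/'))
```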

Ruli