I'm working with a custom dataset laid out as follows:

folder
├── train
│   ├── class1
│   │   ├── file011
│   │   └── file012
│   └── class2
│       ├── file021
│       └── file022
└── val
    ├── class1
    │   ├── file011
    │   └── file012
    └── class2
        ├── file021
        └── file022

When I try to load the dataset with

data_dir = r'PATH_TO_DATA/train'

dataset = datasets.ImageFolder(data_dir, ...)

I get the following error:

FileNotFoundError: Found no valid file for the classes Cat, Deer, Dog, Human. Supported extensions are: .jpg, .jpeg, .png, .ppm, .bmp, .pgm, .tif, .tiff, .webp

The only similar issue I found was here; however, in their case a .ipynb_checkpoints directory seemed to be causing the problem. That doesn't appear to be the case here.

I also checked for hidden files and made sure the file extensions are supported.
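For anyone wanting to repeat that check programmatically, here is a minimal sketch that counts, per class folder, the files whose names end in one of the extensions listed in the error message. The helper name count_valid_files is hypothetical, and the matching rule (case-insensitive suffix check while walking each class folder recursively) is my assumption about how ImageFolder decides what counts as a valid file, so treat it as an approximation rather than the library's exact logic.

```python
import os

# Extensions copied from the error message above.
IMG_EXTENSIONS = (".jpg", ".jpeg", ".png", ".ppm", ".bmp",
                  ".pgm", ".tif", ".tiff", ".webp")


def count_valid_files(data_dir, extensions=IMG_EXTENSIONS):
    """Return {class_name: number of files with a supported extension}.

    Hypothetical helper: it approximates the ImageFolder check,
    it does not call torchvision.
    """
    counts = {}
    for entry in sorted(os.scandir(data_dir), key=lambda e: e.name):
        if not entry.is_dir():
            continue  # only sub-directories are treated as classes
        n = 0
        # Walk the class folder recursively, following symlinks.
        for _root, _dirs, fnames in os.walk(entry.path, followlinks=True):
            n += sum(1 for f in fnames if f.lower().endswith(extensions))
        counts[entry.name] = n
    return counts
```

Calling count_valid_files(r'PATH_TO_DATA/train') once on the mounted path and once on a local copy should show whether the directory listing itself differs between the two. A class that reports 0 on the mount but not locally would point at the mount rather than at the files.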

Edit: An important detail that I hadn't realized was relevant seems to be the cause. I am hosting this data on a remote, using Rclone to mount my OneDrive to access it. When I access the data directly, the dataset is read just fine. It seems to be an issue with ImageFolder accessing data through the remote mount more than anything else.

1 Answer

One way to narrow down the problem is to open a single file directly with Pillow:

import torch
import torchvision.transforms.functional as TF
from PIL import Image

# Open one file directly, bypassing ImageFolder
img = Image.open('PATH_TO_DATA/train/class1/file011')
# Convert the PIL image to a tensor of shape (C, H, W)
img = TF.pil_to_tensor(img)
print(f'[DEBUG] img: {img.shape}, {img.min()}, {img.max()}')
kingsj0405
While this is not the "correct" solution, it helped me with a problem in my dataset. The problem remains when accessing the data through the Rclone remote. – Manu Dwivedi Oct 25 '22 at 00:22