I am having trouble writing the __getitem__() method of my dataset class. I am working on a 3D MRI dataset. Each scan consists of 160 slices in DICOM format, which I have converted to PNG.

The structure of the files looks like this: "/content/drive/MyDrive/mris/9114036/11288003"

Inside the last directory there are the 160 2D slices. The labels are in a .csv file with two columns, one with the id (9114036 for example in the path above) and the other with the grade.

The code I tried to execute was:

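import os

import pandas as pd
import torch
from skimage import io
from torch.utils.data import Dataset

class MyDataset(Dataset):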
    
    def __init__(self, csv_file, root_dir, transform = None):
        self.labels_df = pd.read_csv(csv_file, sep = ';')
        self.root_dir = root_dir
        self.transform = transform
    
    def __len__(self):
        return len(self.labels_df)
    
    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()
        
        img_name = os.path.join(self.root_dir,str(self.labels_df.iloc[idx,0]))
        image = io.imread(img_name, plugin='matplotlib')
        grade = self.labels_df.iloc[idx, 1]
        sample = {'image': image, 'grade': grade}
    
        if self.transform:
            sample = self.transform(sample)
        
        return sample

The error I got when I tried to access a sample from the dataset was:

/usr/local/lib/python3.7/dist-packages/PIL/Image.py in open(fp, mode)
   2841 
   2842     if filename:
-> 2843         fp = builtins.open(filename, "rb")
   2844         exclusive_fp = True
   2845 

IsADirectoryError: [Errno 21] Is a directory: '/content/drive/MyDrive/mris/9114036'

which seems logical.

I tried to use os.walk to get into the 11288003 directory where the images are, but it didn't work. Most likely my whole approach is wrong.

Does anybody know how to write a dataset class that handles the 3D nature of my data? Should I have converted the DICOM files to a different format in the first place?

  • Could you please try printing out "img_name" just before the "imread"? – Mark Lavin Dec 09 '21 at 19:21
  • If I print img_name the system output is the path to the MRI. For example if I set idx =1, I get: img_name = '/content/drive/MyDrive/images_png/9002430' – Kostas Gkrispanis Dec 09 '21 at 19:30
  • Hmm... looks right. How about trying to just "open" the "img_name" file? I'm not familiar with the plug-in feature of "imread". – Mark Lavin Dec 09 '21 at 20:04
  • I am not quite sure if I understood what you meant. Did you mean to replace the io.imread() with open() ? – Kostas Gkrispanis Dec 09 '21 at 20:11
  • Yes, just to see whether the problem is with the file itself, or with imread. – Mark Lavin Dec 09 '21 at 20:12
  • Nope, I get the same error. IsADirectoryError: [Errno 21] Is a directory: '/content/drive/MyDrive/images_png/9002430' – Kostas Gkrispanis Dec 09 '21 at 21:50
  • Can you list the contents of that file/dir. If you have access to a shell, issue the command "ls -al /content/drive/MyDrive/images_png/9002430" I want to see if Linux thinks it's a directory. – Mark Lavin Dec 09 '21 at 22:19
  • I am working on Google Colab – Kostas Gkrispanis Dec 10 '21 at 11:28
  • In that case, say "! ls -al ... " – Mark Lavin Dec 10 '21 at 13:41
  • For some reason I get this message ls: cannot access 'r/content/drive/MyDrive/images_png/9002430': No such file or directory Do you have any idea why ? – Kostas Gkrispanis Dec 10 '21 at 21:16
  • If I don't put the 'r' before the directory, I get a different output: total 4 drwx------ 2 root root 4096 Dec 7 12:42 11172803 – Kostas Gkrispanis Dec 10 '21 at 21:17
  • Well, you originally said 'The structure of the files looks like this: "/content/drive/MyDrive/mris/9114036/11288003"'. I don't see how this relates to ''/content/drive/MyDrive/images_png/9002430". I'm confused... – Mark Lavin Dec 10 '21 at 21:51
  • You are right I am sorry. The folder name on my local disk was mris. The folder name in drive was images_png. It's the same folder with different name. – Kostas Gkrispanis Dec 10 '21 at 22:07
  • 9114036 and 9002430 are two different mris. There is another subfolder (for 9114036 it's 11288003) where the 160 2D slices of MRIs are stored (transformed in PNG) I hope I made that clear for you. Thank you for trying to help me! I appreciate it – Kostas Gkrispanis Dec 10 '21 at 22:08
  • So, could you please tell me the name of a PNG file that you want to manipulate? That's the name that should be passed to "imread" – Mark Lavin Dec 10 '21 at 22:10
  • The problem is that I don't want to pass one PNG to "imread". I want to pass the whole folder (with the 160 PNG files) as there is one label for all of them. I guess I need to pass the whole folder. The structure is like this: "/content/drive/MyDrive/mris/9114036/11288003/001.png" "/content/drive/MyDrive/mris/9114036/11288003/002.png" .... "/content/drive/MyDrive/mris/9114036/11288003/160.png" – Kostas Gkrispanis Dec 10 '21 at 22:13
  • https://stackoverflow.com/questions/37340129/tensorflow-training-on-my-own-image – Mark Lavin Dec 10 '21 at 22:17
  • You should probably convert the DICOMs to Nifti files, which you can then load (using `nibabel` for example) to get a 3D numpy array (easily convertible to a torch Tensor). If I understand you correctly, that's what your model needs anyway. – asdf Jun 09 '22 at 15:08
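
The IsADirectoryError discussed in the comments comes from handing imread a folder instead of a file. One possible way around it, sketched here under assumptions, is to resolve each ID from the .csv to the list of PNG slices inside its single subfolder and read them one by one. The helper name list_slice_paths is hypothetical, and the sketch assumes exactly one subfolder per ID and zero-padded file names (001.png ... 160.png) as in the paths quoted above. It uses only the standard library; the stacking step is shown as a comment because it depends on numpy and on whichever image reader is used.

```python
from pathlib import Path


def list_slice_paths(root_dir, subject_id):
    """Return the sorted PNG slice paths for one subject (hypothetical helper).

    Assumes the layout root_dir/<subject_id>/<series_id>/<NNN>.png with
    exactly one series folder per subject, as in the example paths.
    """
    subject_dir = Path(root_dir) / str(subject_id)
    series_dirs = sorted(p for p in subject_dir.iterdir() if p.is_dir())
    if len(series_dirs) != 1:
        raise ValueError(
            f"expected one series folder in {subject_dir}, found {len(series_dirs)}"
        )
    # Zero-padded names (001.png ... 160.png) sort correctly as strings.
    return sorted(series_dirs[0].glob("*.png"))


# Inside __getitem__, the single imread call on the folder could then become
# a per-slice read (needs numpy and scikit-image, which are not imported here):
#   paths = list_slice_paths(self.root_dir, self.labels_df.iloc[idx, 0])
#   image = np.stack([io.imread(p) for p in paths])  # one 3D array per scan
```

With this, each dataset item would still carry one grade, but the image would be a stack of all slices rather than a single file. Whether the model expects that layout (slices first, or an extra channel axis) is something to check against the network being trained.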