
I am doing segmentation via deep learning in PyTorch. My dataset consists of ultrasound images in .raw/.mhd format. I want to feed the dataset into the network via a data loader.

I have a few important questions:

  • Does converting the dataset to either .png or .jpg make the segmentation less accurate? (I think I lose some information this way!)

  • Which format loses less data?

  • How should I make a NumPy array if I don't convert the original image format, i.e., .raw/.mhd?

  • How should I load this dataset?

Anubhav Singh
gxa

1 Answer


Knowing little about the raw and mhd formats, I can give only partial answers.

Firstly, jpg is lossy and png is not, so you are certainly losing information with jpg. png is lossless for "normal" images: 1, 3, or 4 channels with 8-bit precision each (16 bits per channel are also supported for grayscale). I know nothing about ultrasound images, but if they use higher precision than that, even png will be lossy.
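To make the precision point concrete: squeezing hypothetical 16-bit scan data into 8 bits is not reversible, as a quick numpy round-trip shows (the array here is synthetic, just for illustration):

```python
import numpy as np

# A synthetic 16-bit "scan" spanning the full uint16 range.
scan16 = np.linspace(0, 65535, 1000).astype(np.uint16)

# Quantize to 8 bits, as saving to an 8-bit png would.
scan8 = (scan16 // 257).astype(np.uint8)

# Map back to the 16-bit range and measure the round-trip error.
restored = scan8.astype(np.int32) * 257
max_err = int(np.abs(scan16.astype(np.int32) - restored).max())
print(max_err)  # nonzero: up to 256 gray levels collapse onto one 8-bit value
```

If the original data really is 8-bit, this error is zero and png conversion is safe; otherwise it is not.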

Secondly, I don't know the details of mhd, or what raw means in the context of ultrasound images. That being said, a quick search shows that .mhd is the MetaImage format, and packages such as SimpleITK can read it into a numpy array.

Finally, to load the dataset you can use the ImageFolder class from torchvision. You need to write a custom function which loads an image given its path (for instance using the package mentioned above) and pass it as the loader keyword argument.

Jatentaki