
I want to read only the header from a .mha file. Ideally this would give me the same result as reading a .mhd file, which contains only the metadata.
I am using SimpleITK for reading, and while it reads the image fine, I specifically want to build a metadata dictionary.
A .mha file is basically a .mhd header with the raw image appended, but I can't find any specification of how to separate the two. I found this in the ITK docs:

To skip the header bytes in the image data file, use
HeaderSize = X
where X is the number of bytes to skip at the beginning of the file before reading image data. If you know there are no trailing bytes (extra bytes at the end of the file) you can specify
HeaderSize = -1
and MetaImage will automatically calculate the number of extract bytes in the data file, assume they those bytes are at the head of the data file, and automatically skip them before beginning to read the image data.

but I can't find this functionality in SimpleITK.
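
For reference, the `HeaderSize = -1` rule quoted above boils down to simple arithmetic: the bytes to skip are the data file's size minus the size of the image data. The sketch below is only an illustration of that calculation, not SimpleITK or ITK code. The function name `header_bytes_to_skip` and the small `ELEMENT_BYTES` table are hypothetical, and it assumes uncompressed data with no trailing bytes.

```python
import math

# Bytes per voxel component for a few common MetaImage ElementType values.
# This table is deliberately incomplete; extend it as needed.
ELEMENT_BYTES = {
    "MET_UCHAR": 1, "MET_CHAR": 1,
    "MET_USHORT": 2, "MET_SHORT": 2,
    "MET_UINT": 4, "MET_INT": 4,
    "MET_FLOAT": 4, "MET_DOUBLE": 8,
}

def header_bytes_to_skip(file_size, dim_size, element_type, channels=1):
    """Return how many leading bytes precede the image data.

    Assumes uncompressed data and no trailing bytes after the image.
    """
    data_bytes = math.prod(dim_size) * ELEMENT_BYTES[element_type] * channels
    skip = file_size - data_bytes
    if skip < 0:
        raise ValueError("file is smaller than the expected image data")
    return skip
```

For example, a 1064-byte data file holding a 4 x 4 x 2 `MET_SHORT` image (64 bytes of data) would have 1000 bytes to skip.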

Jarartur

1 Answer

SimpleITK has a feature for this, which is often used with the DICOM and NIfTI file formats. I've used it to read just the header information of MHD files without pulling the raw data into memory:

import pathlib
import SimpleITK as sitk

# Use pathlib so the path works on any OS
mha_dir = pathlib.Path("/your/files/location")
mha_file = str(mha_dir / "image.mha")  # placeholder file name; substitute your own

# Set up the reader and get the file information
reader = sitk.ImageFileReader()
reader.SetFileName(mha_file)   # Give it the mha file as a string
reader.LoadPrivateTagsOn()     # Make sure it can get all the info
reader.ReadImageInformation()  # Get just the information from the file

# From here you can query the reader much like a SimpleITK image

xdim, ydim, zdim = reader.GetSize()     # Image size in x, y, z (assumes a 3D image)
xres, yres, zres = reader.GetSpacing()  # Voxel spacing (resolution) in x, y, z

# Print every metadata key the reader found, together with its value
meta_keys = reader.GetMetaDataKeys()
for key in meta_keys:
    print(key)
    print(reader.GetMetaData(key))


NBStephens
  • Unfortunately this doesn't solve my problem. I don't want to read the image, only the header info, and here sitk doesn't expose it in its metadata, and there are not enough `Get()` functions to extract all of it (e.g. anatomical orientation). – Jarartur Apr 30 '21 at 06:06
  • My RAM never climbs with a 14 GB file using this, so I don't think it reads in the image data. If you want the metadata that is present in the mha, you can use reader.GetMetaDataKeys() and then look each key up, e.g. reader.GetMetaData('ITK_original_direction'). – NBStephens Apr 30 '21 at 16:31
  • That's the problem here: SimpleITK doesn't read the header information. I tried your method before posting on Stack Overflow, but it returns a single metadata key about the file being read with MetaImageIO and nothing else. – Jarartur May 02 '21 at 11:09
  • That's really strange. I get 5 metadata keys. What are you using to write the mha files? I wonder whether it has to do with how the file is written and then recognized by SimpleITK. Would the NIfTI file format or the human-readable MHD file type work for what you are attempting? If not, it may be worthwhile to explore parsing the bytes directly, which is a bit more complicated. – NBStephens May 03 '21 at 13:33
  • I tried with NIfTI and DICOM and it works there just as you described, but with mha or mhd I just can't get it to work. Maybe I have a weird dataset. I tried to parse the bytes directly, but every method I came up with or found involves reading all the image data, which is suboptimal for performance (it will be used to read many series of images). – Jarartur May 04 '21 at 09:40
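
One way to parse the bytes directly without touching the image data is to read the file line by line and stop at the `ElementDataFile` field. This is only a minimal sketch, not part of SimpleITK. It assumes the usual MetaImage layout: a plain-text header of `Key = Value` lines in which `ElementDataFile` comes last, with the pixel data following immediately when its value is `LOCAL`. The function name `read_mha_header` and the `max_header_bytes` safety limit are hypothetical.

```python
def read_mha_header(path, max_header_bytes=1 << 20):
    """Return the MetaImage header fields as a dict of strings.

    Reading stops at the ElementDataFile line, so the pixel data that
    follows it in a .mha file is never loaded.
    """
    header = {}
    consumed = 0
    with open(path, "rb") as f:
        while consumed < max_header_bytes:
            # Bound each read so a malformed file cannot pull in the whole image.
            line = f.readline(max_header_bytes - consumed)
            if not line:
                break  # end of file
            consumed += len(line)
            key, sep, value = line.decode("ascii", errors="replace").partition("=")
            if not sep:
                continue  # skip anything that is not a "Key = Value" line
            key, value = key.strip(), value.strip()
            header[key] = value
            if key == "ElementDataFile":
                break  # assumed to be the last header field
    return header
```

Values are returned as raw strings (for example `"512 512 120"` for `DimSize`), so numeric fields still need to be split and converted. The same function should also work on a .mhd file, since that is just the header on its own.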