3

Trying to load a chest x-ray DICOM file that has JPEG2000 compression, extract the pixel array, crop it, and then save as a new DICOM file. Tried this on a Windows10 and MacOS machine, but getting similar errors. Running Python 3.6.13, GDCM 2.8.0, OpenJpeg 2.3.1, Pillow 8.1.2 in a conda environment (installed OpenJPEG and GDCM first before installing Pillow and Pydicom).

My initial code:

file_list = [f.path for f in os.scandir(basepath)]
ds = pydicom.dcmread(file_list[0])
arr = ds.pixel_array
arr = arr[500:1500,500:1500]
ds.Rows = arr.shape[0]
ds.Columns = arr.shape[1]
ds.PixelData = arr.tobytes()
outputpath = os.path.join(basepath, "test.dcm")
ds.save_as(outputpath)

Subsequent error: ValueError: With tag (7fe0, 0010) got exception: (7FE0,0010) Pixel Data has an undefined length indicating that it's compressed, but the data isn't encapsulated as required. See pydicom.encaps.encapsulate() for more information

I then tried modifying the ds.PixelData line to ds.PixelData = pydicom.encaps.encapsulate([arr.tobytes()]) which creates the .dcm without error, but when I open the .dcm to view it doesn't show any image (all black).

My next attempt was to see if I needed to somehow compress back to JPEG2000, so I attempted:

arr = Image.fromarray(arr)
output = io.BytesIO()
arr.save(output, format='JPEG2000')

but then I get error: OSError: encoder jpeg2k not available. I also tried format='JPEG' but then it tells me OSError: cannot write mode I;16 as JPEG ...

Any help much appreciated!

anbk
  • 63
  • 1
  • 7
  • Forgot to mention I also tried the original code (without encapsulate), but added a line `ds.decompress()` after the `pydicom.dcmread` line, and also was able to save a .dcm, but end up with a blank image again. Also should mention for imports was using `import pydicom`, `from PIL import Image`, `import io`, and `import numpy as np` – anbk Apr 01 '21 at 23:58
  • 1
    Decompressing the dataset and writing back uncompressed pixel data is certainly the easiest way to go - compressing the data set can be tricky. If your resulting image was black I would guess the problem is with the data or the data representation. – MrBean Bremen Apr 02 '21 at 08:11
  • I noticed that if I used pydicom's apply_voi_lut() function that I was able to get the image from black to some of the image visible, but with the contrast/brightness messed up and not able to change the image contrast in any DICOM viewers. Maybe I need to modify some DICOM tags to get it working, but thought ds.decompress() automatically adjusts the relevant tags. Think I was able to find a solution using `from imagecodecs import jpeg2k_encode` Will post back once I confirm it works as expected – anbk Apr 02 '21 at 18:08
  • `apply_modality_lut/apply_voi_lut()` are for displaying DICOM data yourself, e.g. creating drawing data from the image data - the DICOM viewers will do this themselves, so they should get the unchanged values. – MrBean Bremen Apr 02 '21 at 18:31

1 Answers1

0

Was able to get this working by using the imagecodecs library and the jpeg2k_encode function. One potential pitfall is you need to .copy() the array to meet the function's C contiguous requirement, which you can confirm by running arr_crop.flag if you needed to. Here is the updated code that worked best for me:

import os
import numpy as np
import matplotlib.pyplot as plt
import pydicom
from pydicom.encaps import encapsulate
from pydicom.uid import JPEG2000
from imagecodecs import jpeg2k_encode

file_list = [f.path for f in os.scandir(basepath)]
ds = pydicom.dcmread(file_list[0])
arr = ds.pixel_array
#Need to copy() to meet jpeg2k_encodes C contiguous requirement
arr_crop = arr[500:1500,500:1500].copy() 
# jpeg2k_encode to perform JPEG2000 compression
arr_jpeg2k = jpeg2k_encode(arr_crop)
# convert from bytearray to bytes before saving to PixelData
arr_jpeg2k = bytes(arr_jpeg2k)
ds.Rows = arr_crop.shape[0]
ds.Columns = arr_crop.shape[1]
ds.PixelData = encapsulate([arr_jpeg2k])
outputpath = os.path.join(basepath, "test.dcm")
ds.save_as(outputpath)

I also ended up using the interactivecrop package to relatively quickly get the crop indices I needed (a tip in case future folks try this in jupyter). In case it's helpful, here's a snippet of code for that (which is run before the above):

from interactivecrop.interactivecrop import main as crop
file_names = [os.path.split(f)[1].split(".")[0] for f in file_list]
image_list = []
for x in file_list:
    ds = pydicom.dcmread(x)
    arr = ds.pixel_array
    image_list.append(arr)
crop(image_list, file_names, optimize=True)
#After cropping all images, will print a dictionary
#copied and pasted this dictionary to a new cell as crop_dict
#use the below function to convert the output to actual indices
def convert_crop_to_index(fname, crop_dict):
    x = [crop_dict[fname][1], crop_dict[fname][1] + crop_dict[fname][3]]
    y = [crop_dict[fname][0], crop_dict[fname][0] + crop_dict[fname][2]]
    return x, y
arr_crop = arr[x[0]:x[1],y[0]:y[1]].copy()

Was never able to quite figure out why ds.decompress() and saving the decompressed dicom was generating an all black image. I feel like that should have been the easiest method, but the above ended up working for me, so I'm happy was able to figure it out.

anbk
  • 63
  • 1
  • 7
  • 1
    There seems to be an issue with GDCM and JPEG2K Pixel Data that include the JP2 header (which they're not supposed to). See [this issue](https://github.com/pydicom/pydicom/issues/1320) for another example. – scaramallion Apr 05 '21 at 23:27
  • That's interesting, thanks for sharing! Had no idea how to deal with the byte form of PixelData directly. It may not be related to this particular issue since I am not having a problem viewing/decoding the jpeg2k encoded data, but rather the writing/encoding part. – anbk Apr 06 '21 at 16:09