Trying to crop image and save dicom with pydicom,

Question

Trying to load a chest x-ray DICOM file that has JPEG2000 compression, extract the pixel array, crop it, and then save as a new DICOM file. Tried this on a Windows10 and MacOS machine, but getting similar errors. Running Python 3.6.13, GDCM 2.8.0, OpenJpeg 2.3.1, Pillow 8.1.2 in a conda environment (installed OpenJPEG and GDCM first before installing Pillow and Pydicom).

My initial code:

file_list = [f.path for f in os.scandir(basepath)]
ds = pydicom.dcmread(file_list[0])
arr = ds.pixel_array
arr = arr[500:1500,500:1500]
ds.Rows = arr.shape[0]
ds.Columns = arr.shape[1]
ds.PixelData = arr.tobytes()
outputpath = os.path.join(basepath, "test.dcm")
ds.save_as(outputpath)

Subsequent error: ValueError: With tag (7fe0, 0010) got exception: (7FE0,0010) Pixel Data has an undefined length indicating that it's compressed, but the data isn't encapsulated as required. See pydicom.encaps.encapsulate() for more information

I then tried modifying the ds.PixelData line to ds.PixelData = pydicom.encaps.encapsulate([arr.tobytes()]) which creates the .dcm without error, but when I open the .dcm to view it doesn't show any image (all black).

My next attempt was to see if I needed to somehow compress back to JPEG2000, so I attempted:

arr = Image.fromarray(arr)
output = io.BytesIO()
arr.save(output, format='JPEG2000')

but then I get error: OSError: encoder jpeg2k not available. I also tried format='JPEG' but then it tells me OSError: cannot write mode I;16 as JPEG ...

Any help much appreciated!

Forgot to mention I also tried the original code (without encapsulate), but added a line `ds.decompress()` after the `pydicom.dcmread` line, and also was able to save a .dcm, but end up with a blank image again. Also should mention for imports was using `import pydicom`, `from PIL import Image`, `import io`, and `import numpy as np` — anbk, Apr 01 '21 at 23:58
Decompressing the dataset and writing back uncompressed pixel data is certainly the easiest way to go - compressing the data set can be tricky. If your resulting image was black I would guess the problem is with the data or the data representation. — MrBean Bremen, Apr 02 '21 at 08:11
I noticed that if I used pydicom's apply_voi_lut() function that I was able to get the image from black to some of the image visible, but with the contrast/brightness messed up and not able to change the image contrast in any DICOM viewers. Maybe I need to modify some DICOM tags to get it working, but thought ds.decompress() automatically adjusts the relevant tags. Think I was able to find a solution using `from imagecodecs import jpeg2k_encode` Will post back once I confirm it works as expected — anbk, Apr 02 '21 at 18:08
`apply_modality_lut/apply_voi_lut()` are for displaying DICOM data yourself, e.g. creating drawing data from the image data - the DICOM viewers will do this themselves, so they should get the unchanged values. — MrBean Bremen, Apr 02 '21 at 18:31

anbk · Answer 1 · 2021-04-02T21:34:27.243

Was able to get this working by using the imagecodecs library and the jpeg2k_encode function. One potential pitfall is you need to .copy() the array to meet the function's C contiguous requirement, which you can confirm by running arr_crop.flag if you needed to. Here is the updated code that worked best for me:

import os
import numpy as np
import matplotlib.pyplot as plt
import pydicom
from pydicom.encaps import encapsulate
from pydicom.uid import JPEG2000
from imagecodecs import jpeg2k_encode

file_list = [f.path for f in os.scandir(basepath)]
ds = pydicom.dcmread(file_list[0])
arr = ds.pixel_array
#Need to copy() to meet jpeg2k_encodes C contiguous requirement
arr_crop = arr[500:1500,500:1500].copy() 
# jpeg2k_encode to perform JPEG2000 compression
arr_jpeg2k = jpeg2k_encode(arr_crop)
# convert from bytearray to bytes before saving to PixelData
arr_jpeg2k = bytes(arr_jpeg2k)
ds.Rows = arr_crop.shape[0]
ds.Columns = arr_crop.shape[1]
ds.PixelData = encapsulate([arr_jpeg2k])
outputpath = os.path.join(basepath, "test.dcm")
ds.save_as(outputpath)

I also ended up using the interactivecrop package to relatively quickly get the crop indices I needed (a tip in case future folks try this in jupyter). In case it's helpful, here's a snippet of code for that (which is run before the above):

from interactivecrop.interactivecrop import main as crop
file_names = [os.path.split(f)[1].split(".")[0] for f in file_list]
image_list = []
for x in file_list:
    ds = pydicom.dcmread(x)
    arr = ds.pixel_array
    image_list.append(arr)
crop(image_list, file_names, optimize=True)
#After cropping all images, will print a dictionary
#copied and pasted this dictionary to a new cell as crop_dict
#use the below function to convert the output to actual indices
def convert_crop_to_index(fname, crop_dict):
    x = [crop_dict[fname][1], crop_dict[fname][1] + crop_dict[fname][3]]
    y = [crop_dict[fname][0], crop_dict[fname][0] + crop_dict[fname][2]]
    return x, y
arr_crop = arr[x[0]:x[1],y[0]:y[1]].copy()

Was never able to quite figure out why ds.decompress() and saving the decompressed dicom was generating an all black image. I feel like that should have been the easiest method, but the above ended up working for me, so I'm happy was able to figure it out.

There seems to be an issue with GDCM and JPEG2K Pixel Data that include the JP2 header (which they're not supposed to). See [this issue](https://github.com/pydicom/pydicom/issues/1320) for another example. — scaramallion, Apr 05 '21 at 23:27
That's interesting, thanks for sharing! Had no idea how to deal with the byte form of PixelData directly. It may not be related to this particular issue since I am not having a problem viewing/decoding the jpeg2k encoded data, but rather the writing/encoding part. — anbk, Apr 06 '21 at 16:09

Trying to crop image and save dicom with pydicom,

1 Answers1