Python best-practices and pydicom dataset attributes question

Question

Is adding new attributes to an existing published module class's instances considered a Bad Practice in general, and instead you should sub-class it?

Background

I'm using pydicom, and want to keep some additional info with the pydicom Datasets (eg file_name for the file it was opened from, scaled_pixel_array for a zoomed pixel array, etc).

My quick approach (rather than sub-classing) has been just adding some additional attributes to the Datasets returned by pydicom, eg:

dcm = pydicom.dcmread('/tmp/1.dcm')

dcm.file_name = '/tmp/1.dcm'   # <- this is an attribute I'm adding
dcm.scaled_pixel_array = <make a zoomed version of the image>

While this works fine, it fails if I try and pickle/de-pickle the Datasets (bug in pydicom 1.3, in pydicom 1.2.2 the added attributes don't appear when de-pickled).

This problem I can work around, or wait until version 1.3.1, but it made me want to ask a general python style question (I'm fairly new to python)

Question

My question is more about general python best practices/style:

Is adding attributes to a public imported module's instance's, like I did above, considered Bad Practice, and instead I should sub-class the pydicom Dataset (or any class in general) to add the new attributes?

Dataset itself already does this -- e.g. `filename` is already an attribute. Are there others that you need? As to general practice, adding attributes does involve some risk, including with pydicom. But it is possible for authors to prevent it using `__slots__`, so if they didn't do that, it may be acceptable to add your own. Use with caution I would say. — darcymason, Oct 31 '19 at 14:01
Thanks Darcy. Is there somewhere documented (I've looked a bunch) what the additional attrib's are? The other main attrib I add btw is (for CT mainly) a pixel value scaled version of pixel_array. For CT the pixel_array values don't have slope/intercept applied to get HU values. I believe I can overwrite pixel_array right? But I'm not sure on the performance impact because I believe it will echo those changes to the raw pixel attrib. ...so it would be nice to be able to get the pixel_array attrib in a version with slope/intercept already applied I think (for my use cases at least) — Richard, Nov 02 '19 at 06:46
Well, I misspoke a little. It's not Dataset but FileDataset (returned from `dcmread`) which has the filename attribute [docs here](https://pydicom.github.io/pydicom/stable/api_ref.html#pydicom.dataset.FileDataset). Looks like there are some formatting issues there, the type (e.g. 'str') is mashed against the attribute name, at least in my browser. — darcymason, Nov 02 '19 at 19:15
You can modify the pixel array any way you like, and it will stay independent of the raw pixel attrib (PixelData); in fact if you write to file again you have to use `ds.PixelData = pix.tobytes()` to set the raw data before writing again if you wish to change the output. Agree it would be nice to have a slope/intercept conversion, that would often be needed by users. Pydicom started as being about granting access to the raw info from files, without much extra intelligence added, but features have built over time, so maybe this could be looked into. — darcymason, Nov 02 '19 at 19:20
BTW, there is some code [here](https://www.raddq.com/dicom-processing-segmentation-visualization-in-python/) for processing sets of slices and converting to HU. — darcymason, Nov 02 '19 at 19:33
Thanks Darcy! Can you replace pixel_array, not just modify the elements? My use case for that is that I often use a spatially scaled version of the pixel data - eg enlarged from 256x256 to 1024x1024. I tried assigning pixel_data to a new (enlarged) array, but that failed. So right now I simply use a separate attributed on each dataset called .scaled_pixel_array. — Richard, Nov 03 '19 at 19:09
Well, you could assign your scaled array using tobytes() as mentioned, then change ds.Rows and ds.Columns to the new size. Then calls to pixel_array should work (assuming you haven't changed bit depth also). Note it is `PixelData` (DICOM keyword style), not pixel_data. But I think adding a `scaled_pixel_array` seems like a very reasonable solution too. — darcymason, Nov 04 '19 at 01:05

Python best-practices and pydicom dataset attributes question

Background

Question

0 Answers0