I am trying to read DICOM files using pydicom in Python and want to store the header data into a pandas dataframe. How do I extract the data element value for this purpose?

So far I have created a dataframe whose columns are the tag names in the DICOM file. I can access each data element, but I only need to store its value, not the entire element. To get at the value, I converted the element to a string and tried to split it, but that does not work because the string representations of different tags have different lengths.

import pydicom as dicom
import pandas as pd

refDs = dicom.dcmread('000000.dcm')
info_header = refDs.dir()

df = pd.DataFrame(columns=info_header)
print(df)

info_data = []
for i in info_header:
    if i in refDs:
        info_data.append(str(refDs.data_element(i)).split(" ")[0])

print(info_data[0], len(info_data))

I put the data elements in a list because I could not put them into the dataframe directly. The output of the above code is:

(0008, 0050) Accession Number                    SH: '1091888302507299' 89

But I only want to store the data inside the quotes.

Amit Joshi
Ashutosh Kumar

1 Answer

This works for me:

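import pydicom as dicom
import pandas as pd

# dcmread() is the current name; read_file() is a deprecated alias
ds = dicom.dcmread('path_to_file')
df = pd.DataFrame(ds.values())
# ds.values() can yield RawDataElement objects that are not parsed yet; convert those first
df[0] = df[0].apply(lambda x: dicom.dataelem.DataElement_from_raw(x) if isinstance(x, dicom.dataelem.RawDataElement) else x)
# Pull the human-readable name and the stored value out of each data element
df['name'] = df[0].apply(lambda x: x.name)
df['value'] = df[0].apply(lambda x: x.value)
df = df[['name', 'value']]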

Optionally, you can then transpose it so that each element name becomes a column:

df = df.set_index('name').T.reset_index(drop=True)

Nested fields (sequence elements) would require more work if you also need them.

gil-c
How could the tags [(0002, 0000), (0002, 0001), etc.] also be exported to the CSV file? – TTZ Feb 15 '23 at 22:12