Using numpy.reshape helped a lot, and using map helped a little. Is it possible to speed this up some more?
import pydicom
import numpy as np
import cProfile
import pstats


def parse_coords(contour):
    """Given a contour from a DICOM ROIContourSequence, returns coordinates
    [loop][[x0, x1, x2, ...][y0, y1, y2, ...][z0, z1, z2, ...]]"""
    if not hasattr(contour, "ContourSequence"):
        return []  # empty structure
    def _reshape_contour_data(loop):
        return np.reshape(np.array(loop.ContourData),
                          (3, len(loop.ContourData) // 3),
                          order='F')
    return list(map(_reshape_contour_data, contour.ContourSequence))


def profile_load_contours():
    rs = pydicom.dcmread('RS.gyn1.dcm')
    structs = [parse_coords(contour) for contour in rs.ROIContourSequence]


cProfile.run('profile_load_contours()', 'prof.stats')
p = pstats.Stats('prof.stats')
p.sort_stats('cumulative').print_stats(30)
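
For reference, here is a small sketch of what the reshape in _reshape_contour_data does. The numbers are made up; the only assumption is that the flat list is interleaved as x0, y0, z0, x1, y1, z1, ... the way ContourData is.

```python
import numpy as np

# Hypothetical flat list in ContourData layout: x0, y0, z0, x1, y1, z1
flat = [10.0, 20.0, 30.0, 11.0, 21.0, 31.0]

# Same call as in _reshape_contour_data: Fortran order fills each column
# with one (x, y, z) triple, so the rows end up as all x, all y, all z.
coords = np.reshape(np.array(flat), (3, len(flat) // 3), order='F')
xs, ys, zs = coords
```
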
Profile output, using a real structure set exported from Varian Eclipse:
         ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
              1    0.000    0.000   12.165   12.165  {built-in method builtins.exec}
              1    0.151    0.151   12.165   12.165  <string>:1(<module>)
              1    0.000    0.000   12.014   12.014  load_contour_time.py:19(profile_load_contours)
              1    0.000    0.000   11.983   11.983  load_contour_time.py:21(<listcomp>)
             56    0.009    0.000   11.983    0.214  load_contour_time.py:7(parse_coords)
    50745/33837    0.129    0.000   11.422    0.000  /home/cf/python/venv/lib/python3.5/site-packages/pydicom/dataset.py:455(__getattr__)
    50741/33825    0.152    0.000   10.938    0.000  /home/cf/python/venv/lib/python3.5/site-packages/pydicom/dataset.py:496(__getitem__)
          16864    0.069    0.000    9.839    0.001  load_contour_time.py:12(_reshape_contour_data)
          16915    0.101    0.000    9.780    0.001  /home/cf/python/venv/lib/python3.5/site-packages/pydicom/dataelem.py:439(DataElement_from_raw)
          16915    0.052    0.000    9.300    0.001  /home/cf/python/venv/lib/python3.5/site-packages/pydicom/values.py:320(convert_value)
          16864    0.038    0.000    7.099    0.000  /home/cf/python/venv/lib/python3.5/site-packages/pydicom/values.py:89(convert_DS_string)
          16870    0.042    0.000    7.010    0.000  /home/cf/python/venv/lib/python3.5/site-packages/pydicom/valuerep.py:495(MultiString)
          16908    1.013    0.000    6.826    0.000  /home/cf/python/venv/lib/python3.5/site-packages/pydicom/multival.py:29(__init__)
        3004437    3.013    0.000    5.577    0.000  /home/cf/python/venv/lib/python3.5/site-packages/pydicom/multival.py:42(number_string_type_constructor)
3038317/3038231    1.037    0.000    3.171    0.000  {built-in method builtins.hasattr}
Much of the time is in convert_DS_string. Is it possible to make it faster? I guess part of the problem is that the coordinates are not stored very efficiently in the DICOM file.
EDIT:
As a way of avoiding the loop at the end of MultiValue.__init__, I am wondering about getting the raw double string of each ContourData and using numpy.fromstring on it. However, I have not been able to get the raw double string.
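
To make the idea concrete, here is a minimal sketch of the parsing half only, assuming the raw ContourData value is already in hand as backslash-separated ASCII bytes. The function name contour_bytes_to_xyz and the sample bytes are hypothetical, and getting the raw bytes out of pydicom is not shown. pydicom documents Dataset.get_item as returning the raw element when its value has not been accessed yet, which might be one way in, but that is untested here.

```python
import numpy as np


def contour_bytes_to_xyz(raw):
    """Parse a backslash-separated decimal byte string into a (3, N) array.

    Assumes `raw` holds x0, y0, z0, x1, y1, z1, ... as ASCII text with
    a backslash between values (hypothetical helper, not part of pydicom).
    """
    # Decimal strings may carry trailing padding, so strip it first.
    text = raw.decode("ascii").strip()
    # Text-mode fromstring parses all values in one call, no Python loop.
    flat = np.fromstring(text, dtype=np.float64, sep="\\")
    if flat.size % 3:
        raise ValueError("expected a multiple of 3 values, got %d" % flat.size)
    # Fortran order gives rows [x0, x1, ...], [y0, y1, ...], [z0, z1, ...]
    return flat.reshape((3, -1), order="F")


# Made-up sample standing in for one loop's raw ContourData bytes
sample = b"1.0\\2.0\\3.0\\4.0\\5.0\\6.0 "
xyz = contour_bytes_to_xyz(sample)
```

If this works on real data, it would replace both the per-value conversion inside pydicom and the np.array(loop.ContourData) call, since the result is already a float array in the layout parse_coords returns.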