DICOM: how to resample multi modality data with different origins?

Question

I have 2 sets of DICOM image data for 1 subject, consisting of a PET scan and a CT scan which were taken at the same time. The Frame of Reference UIDs are different, which I think means that their reference origins are different, so the 'Image Position (Patient)' tags can't be compared directly.

What I want to do is resample both images such that their spatial dimensions are equal and their pixel dimensions are equal. The task seems relatively straightforward, but for the fact that their origins are different.

Download link for data

If they don't share the same Frame of Reference, you will have to register the two datasets using either fiducial markers or the image data itself. — Suever, Mar 07 '17 at 19:34
Can you provide a pair of example images with the relevant origins? If you provide a link, a user with more reputation can add the images to the question. — Cecilia, Mar 07 '17 at 19:55
@Cecilia, I added the links. Please let me know if there is any other information that should be added. — user152153, Mar 07 '17 at 22:58
@user152153 just adding to Suever's comment, since the two images were taken simultaneously, in theory a simple affine registration is all you need. Simply identify a minimum of three landmarks on both images and then calculate the 3x3 transformation matrix between the two. — Tasos Papastylianou, Mar 08 '17 at 01:08
@TasosPapastylianou - Thanks for your comment! Is this what happens (do you suppose) 'behind the scenes' when commercial medical imaging software, overlays data from 1 modality onto another? (assuming once again, that the data has different physical dimensions and different origins) — user152153, Mar 08 '17 at 19:21
@user152153 if taken at the same time on the same hybrid machine, then yes. If not, then you're looking at more elaborate registration approaches. — Tasos Papastylianou, Mar 08 '17 at 22:18
@TasosPapastylianou, if you don't mind, I'm still a bit confused, does a landmark mean a pixel or an ROI? also, why 3? What makes sense to me is something like this example: http://imgur.com/a/WP7At where (x_new, y_new, z_new) would be the transformed coordinates of Image A into the space of Image B. This wouldn't work? — user152153, Mar 08 '17 at 23:29
apologies you're right, I forgot about translation. you'll need 4 landmarks. I'll write a proper answer why. — Tasos Papastylianou, Mar 09 '17 at 02:39

score 2 · Accepted Answer · answered Mar 09 '17 at 04:21

For any two images A and B deemed to represent the same object, registration is the act of identifying for each pixel / landmark in A the equivalent pixel / landmark in B.
Assuming each pixel in both A and B can be embedded in a coordinate system, registration usually entails transforming A such that, after the transformation, the coordinates of each pixel in A coincide with those of the equivalent pixel in B (i.e. the objective is for the two objects to overlap in that coordinate space).
An isometric transformation is one that preserves distances: the distance between any two pixels in A is the same before and after the transformation has been applied. For instance, rotation in space, reflection (i.e. mirror image), and translation (i.e. shifting the object in a particular direction) are all isometric transformations. A registration algorithm applying only isometric transformations is said to be rigid.
An affine transformation is similar to an isometric one, except that scaling and shearing may also be involved (i.e. the object can also grow, shrink, or skew).
In medical imaging, if A and B were obtained at different times, it is highly unlikely that the transformation is a simple affine or isometric one. For instance, say during scan A the patient had their arms down by their side, and in scan B the patient had their arms over their head. There is no rigid registration of A that would result in perfect overlap with B, since distances between equivalent points have changed (e.g. the head-to-hand and hand-to-foot distances in each case). Therefore more elaborate non-rigid registration algorithms would need to be used.
The fact that in your case A and B were obtained during the same scanning session in the same machine means that it's a reasonable assumption that the transformation will be a simple affine one. I.e. you will probably only need to rotate and translate the object a bit; if the coordinate system of A is 'denser' than B, you might also need to grow / shrink it a bit. But that's it, no weird 'warping' will be necessary to compensate for 'movement' occurring between scans A and B being obtained, since they happened at the same time.
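A 3D vector, denoting a 'magnitude and direction' in 3D space, can be transformed to another 3D vector using a 3x3 transformation matrix T. For example, if you apply a transformation T to a vector v = [x; y; z] (using matrix multiplication), the resulting vector u is
$$u = Tv = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} ax + by + cz \\ dx + ey + fz \\ gx + hy + iz \end{bmatrix}$$
In other words, the 'new' x-coordinate depends on the old x, y, and z coordinates in a manner specified by the transformation matrix, and similarly for the new y and new z coordinates.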
If you apply a 3x3 transformation T to three vectors at the same time, you'll get three transformed vectors out. e.g. for v = [v1, v2, v3] where v1 = [1; 2; 3], v2 = [2; 3; 4], v3 = [3; 4; 5], then T*v will give you a 3x3 matrix u, where each column corresponds to a transformed vector of x,y,z coordinates.
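As a minimal sketch of this, the snippet below multiplies a 3x3 matrix by the three column vectors from the example. The matrix `T` here is a made-up illustration (a 90 degree rotation about z combined with a doubling of z), and `matmul` is a hypothetical helper written for this example, not part of any library.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Hypothetical transformation: rotate 90 degrees about z and double z.
T = [[0, -1, 0],
     [1,  0, 0],
     [0,  0, 2]]

# Each COLUMN is one vector: v1 = [1; 2; 3], v2 = [2; 3; 4], v3 = [3; 4; 5].
v = [[1, 2, 3],
     [2, 3, 4],
     [3, 4, 5]]

# Each column of u is the transformed version of the matching column of v.
u = matmul(T, v)
first_transformed = [row[0] for row in u]  # T applied to v1
```

Reading `u` column by column gives the transformed x, y, z coordinates of each input vector.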
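Now, consider that the transformation matrix T is unknown and we want to discover it. Say we have a known point p = [x; y; z] and we know that after the transformation it becomes a known point p' = [x'; y'; z']. We have:
$$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}$$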

Consider the top row; even if you know p and p', it should be clear that you cannot determine a, b, and c from a single point. You have three unknowns and only one equation. Therefore to solve for a, b, and c, you need at least a system of three equations. The same applies for the other two rows. Therefore, to find the transformation matrix T you need three known points (before and after transformation).
In MATLAB, you can solve such a system of equations, where T*v = u, by typing T = u/v. For a 3x3 transformation matrix T, u and v need to contain at least 3 (linearly independent) vectors, but they can contain more (i.e. the system of equations is overdetermined), in which case MATLAB returns a least-squares solution. With real, noisy landmarks, the more vectors you pass in, the more accurate the estimated transformation matrix tends to be. But in theory you only need three.
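For readers without MATLAB, here is a rough Python sketch of the same idea for the exactly determined case (three points, 3x3 matrix). The names `solve`, `transpose` and `right_divide` are hypothetical helpers written for this illustration, and `fractions.Fraction` is used only to keep the arithmetic exact in a toy example; real landmark data would normally go through a floating-point least-squares routine instead. Note that the vectors v1, v2, v3 used earlier happen to be linearly dependent (v3 = 2*v2 - v1), so they can illustrate applying T but not recovering it.

```python
from fractions import Fraction

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination (a is n x n, b is n x m)."""
    n = len(a)
    # Augmented matrix [a | b], converted to exact rationals.
    aug = [[Fraction(x) for x in ra] + [Fraction(x) for x in rb]
           for ra, rb in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0:
            raise ValueError("points are linearly dependent; T is not unique")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [x / piv for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def transpose(m):
    return [list(r) for r in zip(*m)]

def right_divide(u, v):
    """Return T such that T @ v = u (the role of u / v for a square v)."""
    # T v = u  is equivalent to  v^T T^T = u^T.
    return transpose(solve(transpose(v), transpose(u)))

# Columns of v are three known points; columns of u are where they end up.
v = [[1, 1, 1],
     [0, 1, 1],
     [0, 0, 1]]
u = [[0, -1, -1],
     [1,  1,  1],
     [0,  0,  2]]
T = right_divide(u, v)
```

With these made-up points the recovered `T` is the rotate-and-stretch matrix [[0, -1, 0], [1, 0, 0], [0, 0, 2]].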
If your transformation also involves a translation element, then you need to do the trick described in the image you posted, i.e. you represent a 3D vector [x,y,z] as a homogeneous-coordinates vector [x,y,z,1]. This enables you to add a 4th column to your transformation matrix, which results in a 'translation' for each point, i.e. an extra value added to the new x', y' and z' coordinates, which is independent of the input vector. Since the translation coefficients t_x, t_y, and t_z are also unknown, you now have 12 instead of 9 unknowns, and therefore you need 4 points (not all lying in the same plane) to solve this system, i.e.
$$\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c & t_x \\ d & e & f & t_y \\ g & h & i & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
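A minimal sketch of the homogeneous-coordinates version follows, assuming landmarks are given as (x, y, z) tuples. It accepts four or more landmark pairs and solves the normal equations, which is one simple way to get a least-squares fit; `gauss_solve` and `fit_affine` are hypothetical names, and exact `Fraction` arithmetic is used only so the toy example comes out clean.

```python
from fractions import Fraction

def gauss_solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination (a is n x n, b is n x m)."""
    n = len(a)
    aug = [[Fraction(x) for x in ra] + [Fraction(x) for x in rb]
           for ra, rb in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0:
            raise ValueError("landmarks are coplanar; the affine fit is not unique")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [x / piv for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_affine(src, dst):
    """Return the 4x4 matrix T mapping src landmarks onto dst landmarks.

    src, dst: equal-length lists of (x, y, z), at least 4 non-coplanar points.
    """
    h = [list(p) + [1] for p in src]  # N x 4 homogeneous rows
    # Normal equations (H^T H) X = H^T D, where X is 4 x 3 and T = [X^T; 0 0 0 1].
    hth = [[sum(r[i] * r[j] for r in h) for j in range(4)] for i in range(4)]
    htd = [[sum(r[i] * d[j] for r, d in zip(h, dst)) for j in range(3)]
           for i in range(4)]
    x = gauss_solve(hth, htd)
    return [list(r) for r in zip(*x)] + [[0, 0, 0, 1]]

# Made-up landmarks: B's coordinates are A's doubled, then shifted by (10, 20, 30).
landmarks_a = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
landmarks_b = [(10, 20, 30), (12, 20, 30), (10, 22, 30), (10, 20, 32), (12, 22, 32)]
T = fit_affine(landmarks_a, landmarks_b)
```

The last column of `T` holds the translation, so a fit with no translation would simply show zeros there.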

To summarise:

To transform your image A to occupy the same space as B, interpret the coordinates of A as if they were in the same coordinate system as B, find four equivalent landmarks in both, and obtain a suitable transformation matrix as above by solving this system of equations using the / right matrix division operator. You can then use this transformation matrix T you found, to transform all the coordinates in A (expressed as homogeneous coordinates) to the new ones.
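The final step, applying T to every coordinate in A, could look roughly like this. The matrix values and the function name `apply_affine` are assumptions made up for the illustration; in practice T would come from the landmark fit.

```python
def apply_affine(T, points):
    """Apply a 4x4 homogeneous matrix T to a list of (x, y, z) points."""
    out = []
    for x, y, z in points:
        h = (x, y, z, 1)  # homogeneous form of the point
        # Only the first three rows are needed; the fourth just reproduces the 1.
        out.append(tuple(sum(T[i][k] * h[k] for k in range(4)) for i in range(3)))
    return out

# Hypothetical result of a landmark fit: scale by 2, translate by (10, 20, 30).
T = [[2, 0, 0, 10],
     [0, 2, 0, 20],
     [0, 0, 2, 30],
     [0, 0, 0, 1]]

coords_a = [(0, 0, 0), (1, 1, 1), (3, 0, -2)]
coords_in_b = apply_affine(T, coords_a)
```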

Thanks - this is an awesome answer! Only one question left, well more like a request for advice; since I don't know (and can't know (?) without solving the system) if translation is actually necessary, it seems to make sense to hedge my bets and find 4 landmarks in order to solve the 4x4 system? — user152153, Mar 09 '17 at 18:39
Ok- apologies- 2 questions! Is the above process essentially equivalent to resampling? i.e. after I have found and applied the transformation matrix, will image A have the same spatial and voxel dimensions as image B (supposing again that before the transformation image A has smaller physical dimensions than image B)? — user152153, Mar 09 '17 at 18:49
Re: translation, yes, you might as well. If no translation is involved then the coefficients t_x, t_y, and t_z will simply turn out to be zero (in theory, anyway). — Tasos Papastylianou, Mar 09 '17 at 22:43
Re: resampling, not quite. If image A is M x N pixels and B is K x L, your transformed A will still be M x N, but each pixel will be in the 'correct' position in terms of the coordinate space. But the M x N points will not necessarily fall on 'integer' coordinates like B. If you want to evaluate A at the same (integer) coordinates as the K x L pixels in B, to compare the two images on a pixel-by-pixel basis, then yes, you will have to resample (or more specifically, _interpolate_ ) A to find the equivalent values at those integer coordinates. — Tasos Papastylianou, Mar 09 '17 at 22:56
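To make the interpolation step concrete, here is a small 2D sketch. It uses inverse mapping (for each pixel of B's grid, look up the corresponding fractional position in A and interpolate bilinearly), which is one common way to do this and not necessarily what any particular package does. The names `bilinear`, `resample` and `b_to_a` are hypothetical, and the 2 x 2 image is made up.

```python
import math

def bilinear(img, r, c, fill=0.0):
    """Sample img (list of rows) at fractional (row, col); use fill outside."""
    rows, cols = len(img), len(img[0])
    if not (0 <= r <= rows - 1 and 0 <= c <= cols - 1):
        return fill
    r0, c0 = int(math.floor(r)), int(math.floor(c))
    r1, c1 = min(r0 + 1, rows - 1), min(c0 + 1, cols - 1)
    fr, fc = r - r0, c - c0
    top = img[r0][c0] * (1 - fc) + img[r0][c1] * fc
    bottom = img[r1][c0] * (1 - fc) + img[r1][c1] * fc
    return top * (1 - fr) + bottom * fr

def resample(img_a, b_to_a, shape_b):
    """Evaluate A on B's K x L grid; b_to_a maps a B index to A's index space."""
    k, l = shape_b
    return [[bilinear(img_a, *b_to_a(r, c)) for c in range(l)] for r in range(k)]

# Hypothetical case: A is 2 x 2 and B is 3 x 3 over the same extent,
# so B index (r, c) corresponds to position (r / 2, c / 2) in A.
img_a = [[0.0, 10.0],
         [20.0, 30.0]]
a_on_b_grid = resample(img_a, lambda r, c: (r / 2, c / 2), (3, 3))
```

After this step A has the same K x L shape as B, so the two can be compared pixel by pixel. For 3D volumes the same idea extends to trilinear interpolation.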
