Feature extraction of 3D image dataset

Question

Assume a workflow for 2D image feature extraction that uses SIFT, SURF, or MSER, encodes the detected features with a bag-of-words/bag-of-features model, and then uses that encoding to train classifiers.

I was wondering if there is an analogous approach for 3D datasets, for example, a 3D volume of MRI data. When dealing with 2D images, each image represents an entity with features to be detected and indexed. However, in a 3D dataset is it possible to extract features from the three-dimensional entity? Does this have to be done slice-by-slice, by decomposing the 3D images to multiple 2D images (slices)? Or is there a way of reducing the 3D dimensionality to 2D while retaining the 3D information?

Any pointers would be greatly appreciated.

What does this have to do with python or matlab? – Scott Hunter Apr 15 '16 at 01:49

score 2 · Answer 1 · answered Mar 21 '17 at 12:09

You can perform feature extraction by passing your 3D volumes through a pre-trained 3D convolutional neural network. Because pre-trained 3D CNNs are hard to find, you could consider training your own on a similar, but distinct, dataset.

Here is a link to code for a 3D CNN in Lasagne. The authors use 3D CNN versions of VGG and ResNet.
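As a rough illustration of what a single 3D convolutional layer computes on a volume, here is a minimal NumPy sketch. It is not the linked Lasagne code: the function name `conv3d_features`, the random untrained filters, and the volume size are all assumptions made for the example. A real pre-trained network would stack many such layers with learned weights.

```python
import numpy as np

def conv3d_features(volume, kernels):
    """Valid 3D cross-correlation, ReLU, then global average pooling.

    volume:  array of shape (D, H, W)
    kernels: array of shape (F, k, k, k), cubic filters (assumed)
    returns: feature vector of shape (F,)
    """
    n_filters, k = kernels.shape[0], kernels.shape[1]
    d, h, w = (s - k + 1 for s in volume.shape)
    out = np.zeros((n_filters, d, h, w))
    # Accumulate one kernel offset at a time instead of looping over voxels.
    for i in range(k):
        for j in range(k):
            for l in range(k):
                patch = volume[i:i + d, j:j + h, l:l + w]
                out += kernels[:, i, j, l][:, None, None, None] * patch
    out = np.maximum(out, 0.0)        # ReLU non-linearity
    return out.mean(axis=(1, 2, 3))   # one number per filter

rng = np.random.default_rng(0)
volume = rng.random((16, 32, 32))             # stand-in for one MRI volume
kernels = rng.standard_normal((8, 3, 3, 3))   # random, untrained filters
features = conv3d_features(volume, kernels)
print(features.shape)  # (8,)
```

The resulting vector has one entry per filter and could be fed to any classifier in place of a bag-of-features encoding.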

Alternatively, you can perform 2D feature extraction on each slice of the volume, combine the per-slice features, and use PCA to reduce the dimensionality to something reasonable. For this, I recommend an ImageNet-pre-trained ResNet-50 or VGG.
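The slice-by-slice route can be sketched as follows. To keep the example self-contained, the per-slice extractor `slice_descriptor` is a hypothetical stand-in (a normalized intensity histogram) for the CNN activations you would actually use, and PCA is done with a plain SVD. The array sizes and the random data are assumptions.

```python
import numpy as np

def slice_descriptor(slice_2d, bins=16):
    """Hypothetical stand-in for a 2D feature extractor (e.g. CNN activations)."""
    hist, _ = np.histogram(slice_2d, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()

def volume_descriptor(volume):
    """Concatenate the per-slice descriptors along the first axis."""
    return np.concatenate([slice_descriptor(s) for s in volume])

def pca_reduce(X, n_components):
    """Project the rows of X onto their top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

rng = np.random.default_rng(0)
volumes = rng.random((10, 20, 64, 64))   # 10 volumes, 20 slices of 64x64 each
X = np.stack([volume_descriptor(v) for v in volumes])   # shape (10, 320)
X_reduced = pca_reduce(X, n_components=5)               # shape (10, 5)
print(X.shape, X_reduced.shape)
```

Concatenating slices assumes every volume has the same number of slices in the same order; if that does not hold, pooling (for example averaging) over slices is a common alternative.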

In Keras, these are available in the keras.applications module.

score 0 · Answer 2 · answered Mar 21 '17 at 22:05

Consider a grayscale 2D image, which can be described mathematically as a matrix. Generalizing the concept of a matrix leads to tensors (informally, you can think of them as multidimensional arrays). For example, an RGB 2D image is represented as a tensor of size [width, height, 3], and an RGB 3D image as a tensor of size [width, height, depth, 3]. As with matrices, you can also perform tensor-tensor multiplications.
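In NumPy terms, these shapes and a tensor-tensor product look roughly like this. The concrete sizes and the `weights` matrix are arbitrary choices for illustration.

```python
import numpy as np

gray_2d = np.zeros((64, 48))          # matrix: [width, height]
rgb_2d = np.zeros((64, 48, 3))        # tensor: [width, height, 3]
rgb_3d = np.zeros((64, 48, 20, 3))    # tensor: [width, height, depth, 3]

# Contract the channel axis of the RGB volume with a 3x5 weight matrix,
# the tensor analogue of a matrix-matrix product.
weights = np.ones((3, 5))
mixed = np.tensordot(rgb_3d, weights, axes=([3], [0]))

print(gray_2d.ndim, rgb_2d.ndim, rgb_3d.ndim)  # 2 3 4
print(mixed.shape)                              # (64, 48, 20, 5)
```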

For instance, consider a typical neural network with 2D images as input. Such a network essentially performs matrix-matrix multiplications (apart from the elementwise non-linear operations at the nodes). In the same way, a neural network can operate on tensors by performing tensor-tensor multiplications.

Now back to your question about feature extraction: the main problem with tensors is their high dimensionality. Much current research therefore addresses how to decompose tensors efficiently while retaining the most meaningful information. To extract features from tensors, a tensor decomposition that reduces the rank of the tensor might be a good start. A few papers on tensors in machine learning are:

Tensor Decompositions for Learning Latent Variable Models

Supervised Learning With Quantum-Inspired Tensor Networks

Optimal Feature Extraction and Classification of Tensors via Matrix Product State Decomposition
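To make the decomposition idea concrete, here is a minimal sketch of a truncated higher-order SVD (a Tucker-style decomposition) in NumPy. This is one possible choice and is not the method of any of the papers above, which use other decompositions such as matrix product states. The helper names `unfold` and `truncated_hosvd`, the ranks, and the random volume are assumptions for the example.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def truncated_hosvd(tensor, ranks):
    """Return (core, factors) of a truncated higher-order SVD."""
    factors = []
    for mode, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :r])      # leading r left singular vectors
    core = tensor
    for mode, u in enumerate(factors):
        # Project axis `mode` onto the leading singular vectors.
        core = np.moveaxis(np.tensordot(u.T, core, axes=([1], [mode])), 0, mode)
    return core, factors

rng = np.random.default_rng(0)
volume = rng.random((32, 32, 16))     # stand-in for one 3D image
core, factors = truncated_hosvd(volume, ranks=(4, 4, 3))
features = core.ravel()               # 4 * 4 * 3 = 48 numbers instead of 16384
print(core.shape, features.shape)     # (4, 4, 3) (48,)
```

For features that are comparable across a dataset, the factor matrices would normally be estimated once (for example from the training volumes stacked together) and then reused to project every volume, rather than being recomputed per volume as in this sketch.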

Hope this helps, even though the math behind it is not easy.
