
I have a tensor of shape (n_samples, n_steps, n_features). I want to decompose it into a tensor of shape (n_samples, n_components).

I need a decomposition method with a .fit(...) so that I can apply the same decomposition to a new batch of samples. I have been looking at Tucker decomposition and PARAFAC decomposition, but neither has that crucial .fit(...) and .transform(...) functionality. (Or at least I think they don't?)

I could use PCA, train it on a representative sample, and then call .transform(...) on the remaining samples, but I would rather have some sort of tensor decomposition that can handle all of the samples at once, so as to get a better idea of the differences between samples.
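For reference, the PCA baseline I have in mind looks roughly like this sketch (the shapes and subset size are purely illustrative; the last two axes are flattened so scikit-learn's PCA will accept the data):

import numpy as np
from sklearn.decomposition import PCA

tensor = np.random.random((1000, 20, 8))  # stand-in for (n_samples, n_steps, n_features)
X = tensor.reshape(tensor.shape[0], -1)   # flatten to (n_samples, n_steps * n_features)

pca = PCA(n_components=10)
pca.fit(X[:500])              # fit on a representative subset
reduced = pca.transform(X)    # (n_samples, n_components)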

This is what I mean by "tensor":

In fact tensors are merely a generalisation of scalars and vectors; a scalar is a zero rank tensor, and a vector is a first rank tensor. The rank (or order) of a tensor is defined by the number of directions (and hence the dimensionality of the array) required to describe it.

If you have any questions, please ask; I'll try to clarify my problem if needed.

EDIT: The best solution would be some type of kernel, but I have yet to find a kernel that can deal with rank-n tensors and not just 2D data.


1 Answer


You can do this using the development (master) version of TensorLy. Specifically, you can use the new partial_tucker function (it has not yet made it into the documentation).

Note that the following solution preserves the structure of the tensor, i.e. a tensor of shape (n_samples, n_steps, n_features) is decomposed into a (smaller) tensor of shape (n_samples, n_components_1, n_components_2).

Code

Short answer: this is a very basic class that does what you want (and it would work on tensors of arbitrary order).

import tensorly as tl
from tensorly.decomposition._tucker import partial_tucker

class TensorPCA:
    def __init__(self, ranks, modes):
        self.ranks = ranks  # size of the core along each decomposed mode
        self.modes = modes  # modes (dimensions) along which to decompose

    def fit(self, tensor):
        # Learn one orthogonal projection matrix (factor) per requested mode
        self.core, self.factors = partial_tucker(tensor, modes=self.modes, ranks=self.ranks)
        return self

    def transform(self, tensor):
        # Project a new tensor onto the learned subspaces; the transpose
        # undoes the projection since the factors are orthogonal (see below)
        return tl.tenalg.multi_mode_dot(tensor, self.factors, modes=self.modes, transpose=True)

Usage

Given an input tensor, you can use the previous class by first instantiating it with the desired ranks (size of the core tensor) and modes on which to perform the decomposition (in your 3D case, 1 and 2 since indexing starts at zero):

tpca = TensorPCA(ranks=[4, 5], modes=[1, 2])
tpca.fit(tensor)

Given a new tensor, here called new_tensor, you can project it using the transform method:

tpca.transform(new_tensor)
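Assuming the training tensor had trailing dimensions (11, 12) as in the explanation below, and ranks=[4, 5] as above, the sample axis is left untouched while the other two are reduced:

projected = tpca.transform(new_tensor)  # shape (m, 11, 12) -> (m, 4, 5)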

Explanation

Let's go through the code with an example: first let's import the necessary bits:

import numpy as np
import tensorly as tl
from tensorly.decomposition._tucker import partial_tucker

We then generate a random tensor:

tensor = np.random.random((10, 11, 12))

The next step is to decompose it along its second and third dimensions, or modes (as the first dimension corresponds to the samples):

core, factors = partial_tucker(tensor, modes=[1, 2], ranks=[4, 5])
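A quick shape check makes the result concrete: the first mode (the samples) is untouched, while the other two are reduced to the requested ranks:

print(core.shape)        # (10, 4, 5)
print(factors[0].shape)  # (11, 4) -- projection matrix for the second mode
print(factors[1].shape)  # (12, 5) -- projection matrix for the third mode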

The core corresponds to the transformed input tensor, while factors is a list of two projection matrices, one for the second mode and one for the third mode. Given a new tensor, you can project it to the same subspace (this is what the transform method does) by projecting each of its last two dimensions; applied to the training tensor itself, this recovers the core:

tl.tenalg.multi_mode_dot(tensor, factors, modes=[1, 2], transpose=True)

The transposition here is equivalent to an inverse since the factors are orthogonal.
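Since each factor has orthonormal columns, you can verify that its transpose acts as a left inverse:

print(np.allclose(factors[0].T @ factors[0], np.eye(4)))  # True
print(np.allclose(factors[1].T @ factors[1], np.eye(5)))  # True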

Finally, a note on terminology: even though it is sometimes done, it is probably best not to use the terms order and rank of a tensor interchangeably. The order of a tensor is simply its number of dimensions, while the rank of a tensor is usually a much more complicated notion, which you can think of as a generalization of matrix rank.
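To make the distinction concrete: an outer product of three vectors is a tensor of order 3 (it has three dimensions) but of rank 1 (it is a single outer product):

a, b, c = np.random.random(10), np.random.random(11), np.random.random(12)
rank_one = np.einsum('i,j,k->ijk', a, b, c)  # order 3, rank 1
print(rank_one.ndim)  # 3 -> the order (number of dimensions)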
