I need to calculate the covariance matrix for RGB values across an image dataset, and then apply Cholesky decomposition to the final result.

The covariance matrix for RGB values is a 3x3 matrix M, where M_(i, i) is the variance of channel i and M_(i, j) is the covariance between channels i and j.

The end result should be something like this:

([[0.26, 0.09, 0.02],
[0.27, 0.00, -0.05],
[0.27, -0.09, 0.03]])

I'd prefer to stick to PyTorch functions, even though NumPy has a cov function.

I attempted to recreate the NumPy cov function in PyTorch here, based on other cov implementations and clones:

def pytorch_cov(tensor, tensor2=None, rowvar=True):
    if tensor2 is not None:
        tensor = torch.cat((tensor, tensor2), dim=0)
    tensor = tensor.view(1, -1) if tensor.dim() < 2 else tensor
    tensor = tensor.t() if not rowvar and tensor.size(0) != 1 else tensor
    tensor = tensor - torch.mean(tensor, dim=1, keepdim=True)
    return 1 / (tensor.size(1) - 1) * tensor.mm(tensor.t())

def cov_vec(x):
    c = x.size(0)
    m1 = x - torch.sum(x, dim=[1],keepdims=True)/ c
    out = torch.einsum('ijk,ilk->ijl',m1,m1)  / (c - 1)
    return out

The dataset loading would be like this:

dataset = torchvision.datasets.ImageFolder(data_path)
loader = torch.utils.data.DataLoader(dataset)

for images, _ in loader:
    batch_size = images.size(0) 
    ...

For the moment I'm just experimenting with images created with torch.randn(batch_size, 3, height, width).
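
One way to get a single covariance matrix for the whole dataset, rather than one per image, is to accumulate running sums over the batches. The following is only a sketch under assumptions: the name dataset_rgb_cov is hypothetical, the batches are assumed to be float tensors of shape (B, 3, H, W), and a list of torch.randn batches stands in for the real loader.

```python
import torch

def dataset_rgb_cov(batches):
    """Unbiased 3x3 RGB covariance over every pixel of every batch.

    `batches` is any iterable of float tensors shaped (B, 3, H, W)
    (hypothetical helper, not part of PyTorch).
    """
    n = 0
    channel_sum = torch.zeros(3, dtype=torch.float64)
    outer_sum = torch.zeros(3, 3, dtype=torch.float64)
    for images in batches:
        # (B, 3, H, W) -> (B*H*W, 3): one row per pixel
        pixels = images.permute(0, 2, 3, 1).reshape(-1, 3).to(torch.float64)
        n += pixels.shape[0]
        channel_sum += pixels.sum(dim=0)
        outer_sum += pixels.T @ pixels
    mean = channel_sum / n
    # sum (x - m)(x - m)^T  ==  sum x x^T - n * m m^T
    return (outer_sum - n * torch.outer(mean, mean)) / (n - 1)

# Random batches standing in for the DataLoader
torch.manual_seed(0)
fake_batches = [torch.randn(4, 3, 8, 8) for _ in range(5)]
cov = dataset_rgb_cov(fake_batches)
print(cov)
```

With a real loader that yields (images, labels) pairs, the call would look like dataset_rgb_cov(images for images, _ in loader). Accumulating in float64 is a deliberate choice here, since subtracting two large sums in float32 can lose precision.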

Edit:

I'm attempting to replicate the matrix from TensorFlow's Lucid here, which is somewhat explained on distill.pub here.

Second Edit:

To make the output resemble the example matrix, you have to use an SVD-based square root instead of Cholesky:

rgb_cov_tensor = rgb_cov_tensor / len(loader.dataset)
U,S,V = torch.svd(rgb_cov_tensor)
epsilon = 1e-10
svd_sqrt = U @ torch.diag(torch.sqrt(S + epsilon))

The resulting matrix can then be used to perform color decorrelation, which is useful for visualizing features (DeepDream). I've implemented it in my project here.
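
As a rough illustration of why either factorization can serve as a "square root" of the covariance, the sketch below compares the SVD-based factor with the Cholesky factor on a made-up positive definite matrix and then applies the SVD factor to white-noise pixels. The matrix values are hypothetical stand-ins for a real dataset covariance, torch.linalg.svd is used in place of the older torch.svd, and the way Lucid itself normalizes or applies the matrix is not reproduced here.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the dataset covariance (symmetric positive definite)
A = torch.randn(3, 3, dtype=torch.float64)
rgb_cov_tensor = A @ A.T + 0.1 * torch.eye(3, dtype=torch.float64)

# SVD-based square root, as in the edit above
U, S, Vh = torch.linalg.svd(rgb_cov_tensor)
svd_sqrt = U @ torch.diag(torch.sqrt(S + 1e-10))

# Cholesky factor for comparison (lower triangular)
chol = torch.linalg.cholesky(rgb_cov_tensor)

# Both factors reconstruct the covariance: F @ F.T ~= cov
recon_svd = svd_sqrt @ svd_sqrt.T
recon_chol = chol @ chol.T

# Mixing white-noise pixels with the factor yields pixels whose
# channel covariance approximates rgb_cov_tensor
noise = torch.randn(200_000, 3, dtype=torch.float64)
colored = noise @ svd_sqrt.T
colored_cov = torch.cov(colored.T)

print(svd_sqrt)
print(chol)
```

The two factors generally differ entry by entry (the Cholesky one is triangular, the SVD one is not), which would explain why a Cholesky result does not match a matrix that was produced with SVD.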

ProGamerGov

1 Answer

Here is a function, rgb_cov, that computes the (unbiased) sample covariance matrix of a 3-channel image. The Cholesky decomposition is then straightforward with torch.cholesky (torch.linalg.cholesky in newer PyTorch releases):

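import torch

def rgb_cov(im):
    '''
    Unbiased sample covariance of the channels of im,
    assuming im is a torch.Tensor of shape (H, W, 3).
    '''
    # Flatten to one row per pixel: (H*W, 3)
    im_re = im.reshape(-1, 3)
    # Center out of place; `-=` would modify the caller's tensor when reshape returns a view
    im_re = im_re - im_re.mean(0, keepdim=True)
    return 1 / (im_re.shape[0] - 1) * im_re.T @ im_re

# Test:
im = torch.randn(50, 50, 3)
cov = rgb_cov(im)
# torch.cholesky is deprecated; torch.linalg.cholesky is its replacement
L_cholesky = torch.linalg.cholesky(cov)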
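
A quick sanity check on the estimator might look like the sketch below. For torch.randn input the true covariance is the identity, so the estimate should only approach it as the number of pixels grows. The image sizes and the comparison against torch.cov are illustrative choices, and rgb_cov is repeated (without in-place operations) so that the snippet runs on its own.

```python
import torch

def rgb_cov(im):
    # Same estimator as above, written without in-place ops; im has shape (H, W, 3)
    flat = im.reshape(-1, 3)
    flat = flat - flat.mean(0, keepdim=True)
    return flat.T @ flat / (flat.shape[0] - 1)

torch.manual_seed(0)
eye = torch.eye(3, dtype=torch.float64)

small = torch.randn(10, 10, 3, dtype=torch.float64)      # 100 pixels
large = torch.randn(1000, 1000, 3, dtype=torch.float64)  # 1,000,000 pixels

cov_small = rgb_cov(small)
cov_large = rgb_cov(large)

# Cross-check against torch.cov, which expects variables in rows
ref = torch.cov(small.reshape(-1, 3).T)

# Largest deviation from the identity for each sample size
err_small = (cov_small - eye).abs().max()
err_large = (cov_large - eye).abs().max()
print(err_small.item(), err_large.item())

# The Cholesky factor reconstructs the estimate: L @ L.T ~= cov_large
L = torch.linalg.cholesky(cov_large)
```

The deviation from the identity shrinks with more pixels but does not vanish, because a sample covariance is an estimate of the true covariance.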
Gil Pinsky
  • The diagonal of the covariance matrix should be all ones; it is close to one, but not exactly. Do you know why? – prosti Sep 23 '20 at 00:48
  • @prosti Are you referring to their rgb_cov function or the example result I provided? – ProGamerGov Sep 23 '20 at 00:52
  • @prosti I added an edit with a link to where I got the example matrix that I'm trying to replicate. Unfortunately I haven't yet figured out how to make the rgb_cov function with Cholesky decomposition get an output like the ImageNet example one I posted (the authors were a bit vague). – ProGamerGov Sep 23 '20 at 01:00
  • You added very good links (and important) @ProGamerGov – prosti Sep 23 '20 at 01:02
  • @prosti If you manage to figure it out, let me know! – ProGamerGov Sep 23 '20 at 01:08
  • @prosti That is because it is the sample covariance matrix, which is an estimation of the ground truth covariance matrix – Gil Pinsky Sep 23 '20 at 05:08
  • I added an edit with code that will make the output from the rgb_cov function resemble the ImageNet color correlation matrix. Apparently SVD was used and not Cholesky. – ProGamerGov Sep 24 '20 at 00:46