Use numpy.einsum to calculate the covariance matrix of data

Question

My aim is to calculate the covariance matrix of a set of data using numpy.einsum. Take for instance

example_data = np.array([[0.2, 0.3], [0.1, 0.2]])

The following is code I tried:

import numpy as np

d = example_data.shape[1]  # number of variables (columns)
mu = np.mean(example_data, axis=0)
data = np.reshape(example_data,(len(example_data),d,1))
mu = np.tile(mu,len(example_data))
mu = np.reshape(mu,(len(example_data),d,1))
d_to_mean = data-mu 

covariance_matrix = np.einsum('ijk,kji->ij', d_to_mean, np.transpose(d_to_mean)) 
#I don't know how to set the subscripts correctly

Any suggestions how to make this approach workable are appreciated!

meTchaikovsky · Accepted Answer · 2020-11-23T01:13:05.873

Based on the definition of a covariance matrix, the task can be solved quite easily with

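import numpy as np

tmp = np.random.rand(5,3) # 5 observations (rows), 3 variables (columns)
tmp_mean = np.mean(tmp,axis=0)[:,None] # per-variable mean as a column vector, shape (3,1)
tmp_centered = tmp.T - tmp_mean # centered data, shape (3,5): variables x observations
cov = (tmp_centered @ tmp_centered.T) / (5-1) # divide by n-1, with n = 5 observations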

If you need einsum anyway

cov_ein = np.einsum('ij,jk->ik',tmp_centered,tmp_centered.T) / (5-1)
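The transpose can also be folded into the subscripts, so einsum works on the centered data directly. The following is a minimal sketch, assuming observations are stored in rows; the names `data`, `centered`, and `n` are illustrative and not part of the answer above, and the seed is only there to make the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)        # seeded only so the sketch is repeatable
data = rng.random((5, 3))             # 5 observations (rows), 3 variables (columns)
n = data.shape[0]                     # number of observations

centered = data - data.mean(axis=0)   # subtract each column's mean

# Sum over the observation index n; i and j index the variables.
cov_ein = np.einsum('ni,nj->ij', centered, centered) / (n - 1)

# np.cov treats rows as variables by default, so pass rowvar=False here.
assert np.allclose(cov_ein, np.cov(data, rowvar=False))
```

Repeating the index `n` in both operands and leaving it out of the output is what tells einsum to sum over observations, which is the same contraction as `tmp_centered @ tmp_centered.T` above.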

edited Nov 23 '20 at 01:13

answered Nov 23 '20 at 01:07


What is `(5-1)`, and where does it come from? – Pazu Nov 23 '20 at 01:11
@Pazu You need to divide by `(n-1)`, where `n` is the number of observations (5 here), to get the unbiased sample estimate of the covariance matrix. – meTchaikovsky Nov 23 '20 at 01:11
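To make the choice of divisor concrete, here is a small sketch using the 2x2 data from the question, assuming each row is one observation. The names `scatter`, `cov_unbiased`, and `cov_biased` are illustrative. Dividing by `n-1` should agree with `np.cov`'s default, while dividing by `n` corresponds to `ddof=0`.

```python
import numpy as np

x = np.array([[0.2, 0.3],
              [0.1, 0.2]])             # example data from the question
n = x.shape[0]                         # number of observations

centered = x - x.mean(axis=0)          # subtract each column's mean
scatter = np.einsum('ni,nj->ij', centered, centered)  # sum of outer products

cov_unbiased = scatter / (n - 1)       # sample covariance, np.cov's default
cov_biased = scatter / n               # population-style estimate, ddof=0

assert np.allclose(cov_unbiased, np.cov(x, rowvar=False))
assert np.allclose(cov_biased, np.cov(x, rowvar=False, ddof=0))
```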
