I have a dataset X with 9 features and 683 rows (683x9). I want to compute the covariance matrix of X and another dataset that has the same shape as X. I use np.cov(originalData, generatedData, rowvar=False) to get it, but it returns a covariance matrix of shape 18x18. I expected a 9x9 covariance matrix. Can you please help me fix it?

user3104352
  • Unless you can be more specific about what's going on here, this is not a question that can be answered. A *minimal* example that demonstrates your problem and helps others reproduce it is best. – tadman Jul 16 '17 at 23:14

1 Answer

The function np.cov calculates the covariances for all pairs of variables that you give it. You have 9 variables in one array and 9 more in the other. That's 18 in total, so you get an 18-by-18 matrix. (Under the hood, np.cov concatenates the two arrays you gave it before calculating the covariance.)
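To see the concatenation in action, here is a small sketch. It uses random stand-in data of the same shape as in the question (not the asker's actual datasets), and the variable names are made up for illustration. It compares the two-argument call against calling np.cov once on the horizontally stacked array:

```python
import numpy as np

# Synthetic stand-in data (assumption): 683 rows, 9 features each.
rng = np.random.default_rng(0)
original = rng.normal(size=(683, 9))
generated = rng.normal(size=(683, 9))

# Two-argument call, as in the question.
full = np.cov(original, generated, rowvar=False)

# Same computation on the 18 columns placed side by side.
stacked = np.cov(np.hstack([original, generated]), rowvar=False)

print(full.shape)                  # (18, 18)
print(np.allclose(full, stacked))  # True

# The top-left 9x9 block is just the covariance of the first array alone.
print(np.allclose(full[:9, :9], np.cov(original, rowvar=False)))  # True
```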

If you are only interested in the covariance of the variables from the 1st array with the variables from the 2nd, pick the first half of rows and second half of columns:

C = np.cov(originalData, generatedData, rowvar=False)[:9, 9:]

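Or in general, with two matrices X and Y that have the same number of rows but not necessarily the same number of columns, the column slice has to start where the variables of X end:

C = np.cov(X, Y, rowvar=False)[:X.shape[1], X.shape[1]:]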
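As a sanity check for the general case, the sketch below takes the cross-covariance block for two matrices with different numbers of columns and compares it with a direct computation from centered data. The helper name cross_cov and the random inputs are assumptions for illustration only. Both the row slice and the column slice are offset by X.shape[1], because the variables of X come first in the combined matrix:

```python
import numpy as np

def cross_cov(X, Y):
    """Hypothetical helper: covariance of each column of X with each column of Y."""
    p = X.shape[1]  # number of variables in X; they occupy the first p rows/columns
    return np.cov(X, Y, rowvar=False)[:p, p:]

# Synthetic data (assumption): same number of rows, different numbers of columns.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
Y = rng.normal(size=(50, 5))

C = cross_cov(X, Y)

# Direct computation: center each column, then divide by n - 1 (np.cov's default).
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
manual = Xc.T @ Yc / (X.shape[0] - 1)

print(C.shape)                 # (3, 5)
print(np.allclose(C, manual))  # True
```

For the 683x9 case in the question, the same helper returns a 9x9 block.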