I have a 10000 x 22 array (observations x features) and I fit a Gaussian mixture with one component as follows:
import numpy as np
import sklearn.mixture
mixture = sklearn.mixture.GaussianMixture(n_components=1, covariance_type='full').fit(my_array)
Then I want to calculate the mean and covariance of the conditional distribution of the first two features given the rest, as per equations 2.81 and 2.82 on p. 87 of Bishop's Pattern Recognition and Machine Learning. What I do is the following:
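For reference, these are the two equations as I understand them, with a standing for the first two features and b for the remaining twenty (the notation follows Bishop; x_b is the observed value of the conditioning features):

```latex
\begin{align}
\boldsymbol{\mu}_{a|b} &= \boldsymbol{\mu}_a + \boldsymbol{\Sigma}_{ab}\,\boldsymbol{\Sigma}_{bb}^{-1}\,(\mathbf{x}_b - \boldsymbol{\mu}_b) \tag{2.81}\\
\boldsymbol{\Sigma}_{a|b} &= \boldsymbol{\Sigma}_{aa} - \boldsymbol{\Sigma}_{ab}\,\boldsymbol{\Sigma}_{bb}^{-1}\,\boldsymbol{\Sigma}_{ba} \tag{2.82}
\end{align}
```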
covariances = mixture.covariances_ # shape = (1, 22, 22) where 1 is the 1 component I fit and 22x22 is the covariance matrix
means = mixture.means_ # shape = (1, 22), 22 means; one for each feature
dependent_data = my_array[:, 0:2] # shape = (10000, 2)
conditional_data = my_array[:, 2:] # shape = (10000, 20)
mu_a = means[:, 0:2] # Mu of the dependent variables
mu_b = means[:, 2:] # Mu of the independent variables
cov_aa = covariances[0, 0:2, 0:2] # Cov of the dependent vars
cov_bb = covariances[0, 2:, 2:] # Cov of independent vars
cov_ab = covariances[0, 0:2, 2:]
cov_ba = covariances[0, 2:, 0:2]
A = (conditional_data.transpose() - mu_b.transpose())
B = cov_ab.dot(np.linalg.inv(cov_bb))
conditional_mu = mu_a + B.dot(A).transpose()
conditional_cov = cov_aa - cov_ab.dot(np.linalg.inv(cov_bb)).dot(cov_ba)
My problem is that on calculating the conditional_mu and the conditional_cov, I'm getting the following shapes:
conditional_mu.shape
(10000, 2)
conditional_cov.shape
(2, 2)
I expected conditional_mu to have shape (1, 2), because I only want the means of the first two features given the rest. Why am I getting a mean for each observation instead?
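To make this reproducible, here is a minimal sketch that uses synthetic data in place of my real features. The random data and the names `B`, `mu_all_rows` and `mu_one_row` are illustrative assumptions, not part of my actual code. It gives the same shapes as above, and it also shows that passing a single row of conditioning values yields a (1, 2) mean:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for my_array (assumption: random correlated data, not my real features)
rng = np.random.default_rng(0)
my_array = rng.normal(size=(10000, 22)) @ rng.normal(size=(22, 22))

mixture = GaussianMixture(n_components=1, covariance_type='full').fit(my_array)
cov = mixture.covariances_[0]  # (22, 22)
mu = mixture.means_            # (1, 22)

# Partition into a = first two features, b = remaining twenty
mu_a, mu_b = mu[:, :2], mu[:, 2:]
cov_aa, cov_ab = cov[:2, :2], cov[:2, 2:]
cov_ba, cov_bb = cov[2:, :2], cov[2:, 2:]

# Sigma_ab Sigma_bb^{-1}, computed with solve() rather than an explicit inverse
B = np.linalg.solve(cov_bb, cov_ba).T  # (2, 20)

x_b = my_array[:, 2:]                        # (10000, 20) conditioning values
mu_all_rows = mu_a + (x_b - mu_b) @ B.T      # one conditional mean per row
mu_one_row = mu_a + (x_b[:1] - mu_b) @ B.T   # conditional mean for the first row only
conditional_cov = cov_aa - B @ cov_ba        # does not involve x_b

print(mu_all_rows.shape, mu_one_row.shape, conditional_cov.shape)
```

The covariance line never touches `x_b`, while the mean line does, which matches the shapes I see.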