Efficient Gaussian mixture evaluation

Asked Oct 03 '18 at 01:29

Active Oct 03 '18 at 02:14

Viewed 179 times

I'd like to (efficiently) evaluate a Gaussian mixture model (GMM) over an (n,d) list of datapoints, given the GMM parameters ($\pi_k, \mu_k, \Sigma_k$). I can't find a way to do this using standard sklearn or scipy packages.

EDIT: assume there are n datapoints of dimension d, so the data is (n,d), and the GMM has k components. For example, the covariance matrix of the k-th component, $\Sigma_k$, is (d,d), and altogether $\Sigma$ is (k,d,d).
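For reference, the quantity being evaluated is presumably the standard mixture density at each datapoint $x_i$. The index $j$ below is introduced here only to avoid clashing with the component count k; it is not notation from the setup above.

$$
p(x_i) = \sum_{j=1}^{k} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j),
\qquad
\mathcal{N}(x \mid \mu, \Sigma) = (2\pi)^{-d/2} \, |\Sigma|^{-1/2} \exp\!\left(-\tfrac{1}{2} (x - \mu)^\top \Sigma^{-1} (x - \mu)\right)
$$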

For example, if you first fit a GMM in sklearn, you can call score_samples, but this only works after fitting to data, not with parameters I already have. Or, in scipy, you can run a for-loop over multivariate_normal.pdf with each set of parameters and take a weighted sum/dot product, but this is slow. Checking the source code of either was not illuminating (for me).
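To make the scipy baseline concrete, here is a minimal sketch of the for-loop approach described above. The function name `gmm_pdf_loop` and its argument names are hypothetical, chosen for illustration; the shapes follow the (n,d), (k,), (k,d) and (k,d,d) layout from the edit.

```python
import numpy as np
from scipy.stats import multivariate_normal


def gmm_pdf_loop(X, weights, means, covs):
    """Mixture density at each row of X, one pdf call per component.

    Hypothetical helper: X is (n, d), weights is (k,), means is (k, d),
    covs is (k, d, d). Returns an array of shape (n,).
    """
    X = np.asarray(X, dtype=float)
    density = np.zeros(X.shape[0])
    # Weighted sum over components; each pdf call is vectorized over the n rows.
    for pi_k, mu_k, sigma_k in zip(weights, means, covs):
        density += pi_k * multivariate_normal.pdf(X, mean=mu_k, cov=sigma_k)
    return density
```

The loop runs k times, not n times, so each call already handles all n points at once. The per-call overhead is what tends to hurt when k is large.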

I'm currently hacking something together with n-d arrays and tensor dot products .. oy .. hoping someone has a better way?
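One possible shape for the n-d array version is sketched below. It is not a method taken from sklearn or scipy internals, just a sketch that assumes every $\Sigma_k$ is symmetric positive definite so that a batched Cholesky factorization exists. The name `gmm_logpdf` is hypothetical. It works in log space and combines components with `scipy.special.logsumexp`.

```python
import numpy as np
from scipy.special import logsumexp


def gmm_logpdf(X, weights, means, covs):
    """Log mixture density at each row of X, with no Python loop over components.

    Hypothetical helper: X is (n, d), weights is (k,), means is (k, d),
    covs is (k, d, d) and assumed symmetric positive definite.
    Returns an array of shape (n,).
    """
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    d = X.shape[1]

    L = np.linalg.cholesky(covs)                     # (k, d, d), covs = L @ L^T
    diff = X[None, :, :] - means[:, None, :]         # (k, n, d)
    # Solve L z = diff^T for every component at once; z has shape (k, d, n).
    z = np.linalg.solve(L, np.swapaxes(diff, 1, 2))
    maha = np.sum(z ** 2, axis=1)                    # squared Mahalanobis, (k, n)
    # log|Sigma_k| = 2 * sum(log(diag(L_k)))
    log_det = 2.0 * np.sum(np.log(np.diagonal(L, axis1=1, axis2=2)), axis=1)

    log_comp = -0.5 * (d * np.log(2.0 * np.pi) + log_det[:, None] + maha)
    # Add log weights and combine the k components stably.
    return logsumexp(log_comp + np.log(weights)[:, None], axis=0)
```

Calling `np.exp(gmm_logpdf(X, weights, means, covs))` recovers the density itself. The intermediate arrays have shape (k, n, d), so for large n it may be necessary to process X in chunks.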

edited Oct 03 '18 at 02:14


stevemo


Is each $\Sigma_k$ a d-by-d covariance matrix? – Warren Weckesser Oct 03 '18 at 01:37
@WarrenWeckesser Yes, thanks, edited for clarity – stevemo Oct 03 '18 at 02:14


0 Answers