
I'd like to efficiently evaluate the density of a Gaussian mixture model (GMM) over an (n,d) array of datapoints, given the GMM parameters ($\pi_k, \mu_k, \Sigma_k$). I can't find a way to do this with the standard sklearn or scipy packages.

EDIT: assume there are n datapoints of dimension d, so the data is (n,d), and the GMM has k components. So, for example, the covariance matrix of each component, $\Sigma_k$, is (d,d), and stacked together $\Sigma$ is (k,d,d).

For example, if you first fit a GMM in sklearn, you can call score_samples, but that only works after fitting the model to data, not with parameters I supply directly. Or, in scipy, you can run a for-loop over multivariate_normal.pdf with each set of parameters and take a weighted sum (a dot product with the weights), but this is slow. Checking the source code of either was not illuminating (for me).
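
For reference, the scipy for-loop I mean looks roughly like this. It's a minimal sketch: the function name gmm_pdf_loop and the argument names are just placeholders of mine, with weights of shape (k,), means of shape (k,d), and covs of shape (k,d,d).

```python
import numpy as np
from scipy.stats import multivariate_normal


def gmm_pdf_loop(X, weights, means, covs):
    """Mixture density at each row of X, one component at a time.

    X: (n, d), weights: (k,), means: (k, d), covs: (k, d, d).
    Returns an array of shape (n,).
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    density = np.zeros(X.shape[0])
    # One multivariate_normal.pdf call per component, then a weighted sum.
    for w, mu, cov in zip(weights, means, covs):
        density += w * multivariate_normal.pdf(X, mean=mu, cov=cov)
    return density
```

This gives the right numbers, but it makes k separate calls into scipy, which is the part I'd like to avoid.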

I'm currently hacking something together with n-d arrays and tensor dot products ... oy. Does anyone have a better way?
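
Here is roughly where my hack stands, in case it helps frame the question. It's only a sketch under my own assumptions: the name gmm_logpdf is made up, every covariance is assumed symmetric positive definite (so the batched Cholesky succeeds), and it works in log space with logsumexp to avoid underflow.

```python
import numpy as np
from scipy.special import logsumexp


def gmm_logpdf(X, weights, means, covs):
    """Log mixture density at each row of X, with no loop over components.

    X: (n, d), weights: (k,), means: (k, d), covs: (k, d, d).
    Returns an array of shape (n,).
    """
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    d = X.shape[1]

    # Batched Cholesky factors, covs[j] = L[j] @ L[j].T; shape (k, d, d).
    L = np.linalg.cholesky(covs)

    # Differences between every point and every mean; shape (k, n, d).
    diff = X[None, :, :] - means[:, None, :]

    # Solve L[j] z = diff[j].T for each component; z has shape (k, d, n).
    z = np.linalg.solve(L, diff.transpose(0, 2, 1))

    # Squared Mahalanobis distances; shape (k, n).
    maha = np.sum(z ** 2, axis=1)

    # log det(covs[j]) = 2 * sum(log(diag(L[j]))); shape (k,).
    log_det = 2.0 * np.sum(np.log(np.diagonal(L, axis1=1, axis2=2)), axis=1)

    # Per-component Gaussian log densities; shape (k, n).
    log_comp = -0.5 * (d * np.log(2.0 * np.pi) + log_det)[:, None] - 0.5 * maha

    # Add log weights, then log-sum-exp over the component axis.
    return logsumexp(log_comp + np.log(weights)[:, None], axis=0)
```

np.exp(gmm_logpdf(X, weights, means, covs)) should then give the plain density. The intermediate arrays are (k,n,d), so memory grows with k*n*d, which may matter for large inputs.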

stevemo
