I'd like to (efficiently) evaluate a Gaussian mixture model (GMM) over an `(n,d)` list of datapoints, given the GMM parameters ($\pi_k, \mu_k, \Sigma_k$). I can't find a way to do this using the standard `sklearn` or `scipy` packages.
EDIT: assume there are `n` datapoints of dimension `d`, so the data is `(n,d)`, and the GMM has `k` components. So, for example, the covariance matrix of the k-th component, $\Sigma_k$, is `(d,d)`, and altogether $\Sigma$ is `(k,d,d)`.
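To be explicit about what "evaluate" means here, this is the standard mixture density, written with $j$ as the component index (my choice of symbol, to keep it distinct from the component count `k`) and $x_i$ as the i-th row of the data:

$$
p(x_i) = \sum_{j=1}^{k} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j),
\qquad
\mathcal{N}(x \mid \mu, \Sigma) = \frac{\exp\!\left(-\tfrac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)\right)}{\sqrt{(2\pi)^d \det \Sigma}}
$$

The output should be one value per datapoint, i.e. shape `(n,)`.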
For example, if you first fit a GMM in `sklearn`, you can call `score_samples`, but this only works if I'm fitting to data. Or, in `scipy`, you can run a for-loop over `multivariate_normal.pdf` with each set of parameters and do a weighted sum/dot product, but this is slow. Checking the source code of either was not illuminating (for me).
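For concreteness, the `scipy` for-loop baseline described above could look roughly like this. It is only a minimal sketch: the function name `gmm_pdf_loop` and the toy parameters are made up for illustration, and the shapes are assumed to be `(n,d)` for the data, `(k,)` for the weights, `(k,d)` for the means and `(k,d,d)` for the covariances.

```python
import numpy as np
from scipy.stats import multivariate_normal


def gmm_pdf_loop(X, weights, means, covs):
    """Mixture density at each row of X, looping over the k components.

    Assumed shapes: X (n, d), weights (k,), means (k, d), covs (k, d, d).
    """
    X = np.atleast_2d(X)
    density = np.zeros(X.shape[0])
    for w, mu, cov in zip(weights, means, covs):
        # Weighted density of one Gaussian component, evaluated at all n points.
        density += w * multivariate_normal.pdf(X, mean=mu, cov=cov)
    return density


# Toy parameters (made up): n=5 points, d=2 dimensions, k=3 components.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
weights = np.array([0.5, 0.3, 0.2])
means = rng.normal(size=(3, 2))
covs = np.stack([np.eye(2) * s for s in (1.0, 0.5, 2.0)])

p = gmm_pdf_loop(X, weights, means, covs)  # shape (5,)
```

The loop runs once per component, so the Python-level overhead grows with `k`.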
I'm currently hacking something together with n-d arrays and tensor dot products .. oy .. hoping someone has a better way?
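For illustration, one way a loop-free version over all components might be written is sketched below. This is not necessarily the best approach: it swaps explicit tensor dot products for a batched Cholesky factorization, works in log space, and assumes every covariance matrix is symmetric positive definite. The function name `gmm_logpdf` and the toy parameters are made up.

```python
import numpy as np
from scipy.special import logsumexp


def gmm_logpdf(X, weights, means, covs):
    """Log mixture density at each row of X, with no Python loop over components.

    Assumed shapes: X (n, d), weights (k,), means (k, d), covs (k, d, d).
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    L = np.linalg.cholesky(covs)                       # (k, d, d), lower triangular
    diff = X[None, :, :] - means[:, None, :]           # (k, n, d)
    # Solve L z = diff^T per component, so that sum(z**2) is the Mahalanobis term.
    z = np.linalg.solve(L, np.swapaxes(diff, 1, 2))    # (k, d, n)
    maha = np.sum(z ** 2, axis=1)                      # (k, n)
    # log det(Sigma) = 2 * sum(log(diag(L))) for each component.
    log_det = 2.0 * np.sum(np.log(np.diagonal(L, axis1=1, axis2=2)), axis=1)  # (k,)
    log_norm = -0.5 * (d * np.log(2.0 * np.pi) + log_det)                     # (k,)
    log_comp = log_norm[:, None] - 0.5 * maha          # (k, n) per-component log pdf
    # Add log weights and combine the components stably.
    return logsumexp(log_comp + np.log(weights)[:, None], axis=0)             # (n,)


# Toy parameters (made up): n=6 points, d=2 dimensions, k=3 components.
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))
weights = np.array([0.2, 0.5, 0.3])
means = rng.normal(size=(3, 2))
A = rng.normal(size=(3, 2, 2))
covs = A @ A.transpose(0, 2, 1) + np.eye(2)  # positive definite by construction

log_p = gmm_logpdf(X, weights, means, covs)  # shape (6,)
p = np.exp(log_p)                            # plain densities, if needed
```

The intermediate arrays are `(k, n, d)`, so memory grows with the product of all three sizes; for very large `n` the data may need to be processed in chunks.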