
In R, mclust has a 'modelNames' argument that lets you specify which model to fit. I want to do univariate modeling with unequal variances, i.e. modelNames = "V" in mclust, using mixture.GMM in Python. However, the only thing I can find to tweak is covariance_type. When I run the same data through R and through mixture.GMM under sklearn, I get different fits even though the number of fitted components is the same. What should I change in mixture.GMM to indicate that I am fitting a univariate model with variable variance?

mclust code:

library(mclust)

function(x){Mclust(ma78[x,], G = 2, modelNames = "V", verbose = FALSE)}

GMM code:

from sklearn.mixture import GMM

gmm = GMM(n_components = 2).fit(data)
  • You should post the code you are using. It would be easy to understand that way. – kangaroo_cliff Apr 05 '18 at 15:21
  • Hi, thank you for the reply. I don't know if this would help. Thank you! Mainly the issue is that I do not know what to do under mixture.GMM to make sure it is fitting a univariate model, not a multivariate one. – Jeff The Liu Apr 05 '18 at 15:25

2 Answers


With univariate data, the component variances can either be equal across components or unique to each component (variable). With Mclust these options are modelNames = "E" or "V", respectively.

With sklearn, the corresponding options appear to be covariance_type = "tied" or "full". Possibly something like this for the variable-variance Gaussian mixture model:

from sklearn import mixture

gmm = mixture.GaussianMixture(n_components = 2, covariance_type = 'full').fit(data)

Even using Mclust or sklearn alone, there can be instances where you do not get the same parameter values across different runs - this is because the estimates can depend on the initial values. One way to avoid this is to use a larger number of starts, if such an option is available.
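For example, a minimal sketch (assuming a recent sklearn with GaussianMixture and a 1-D NumPy array as input; the sample data below is made up). n_init restarts EM from several initializations, and a smaller tol tightens convergence:

import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical 1-D sample standing in for the question's `data`.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0, 1, 300), rng.normal(5, 2, 200)])

# sklearn expects shape (n_samples, n_features), so reshape the 1-D series.
X = data.reshape(-1, 1)

# covariance_type='full' gives each component its own variance,
# analogous to mclust's modelNames = "V".
gmm = GaussianMixture(n_components=2, covariance_type='full',
                      n_init=10, tol=1e-5, random_state=0).fit(X)

print(gmm.means_.ravel())        # component means
print(gmm.covariances_.ravel())  # component variances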

  • Thank you. I'd like to point out that I partially solved the problem by changing the argument 'tol' to 1e-5 instead of the default 1e-3; 1e-5 is what tol is set to in mclust, and it gets me a closer result. Thank you! – Jeff The Liu Apr 05 '18 at 18:14

Found the answer on stats.stackexchange. The only thing you have to do is reshape your data with data.reshape(-1, 1) before you pass it into sklearn.mixture.GaussianMixture.
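A minimal sketch of that reshape, assuming data is a 1-D NumPy array (the values here are made up):

import numpy as np
from sklearn.mixture import GaussianMixture

data = np.array([1.2, 0.9, 1.1, 5.0, 5.3, 4.8])  # hypothetical univariate sample

# GaussianMixture expects a 2-D array of shape (n_samples, n_features),
# so a univariate series has to become a single-column matrix.
X = data.reshape(-1, 1)

gmm = GaussianMixture(n_components=2).fit(X)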
