I would like to use a GaussianMixture for generating random data. The parameters should not be learnt from data but supplied.

GaussianMixture allows supplying initial values for the weights, means, and precisions, but calling sample() on the unfitted estimator is still not possible.

Example:

import numpy as np
from sklearn.mixture import GaussianMixture
d = 10
k = 2
_weights = np.random.gamma(shape=1, scale=1, size=k)
data_gmm = GaussianMixture(n_components=k, 
                           weights_init=_weights / _weights.sum(),
                           means_init=np.random.random((k, d)) * 10,
                           precisions_init=[np.diag(np.random.random(d)) for _ in range(k)])
data_gmm.sample(100)

This throws:

NotFittedError: This GaussianMixture instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

I've tried:

  • Calling _initialize_parameters(): this also requires supplying a data matrix, and it does not initialize the covariances attribute needed for sampling.
  • Calling set_params(): this does not allow supplying values for the attributes used by sampling.

Any help would be appreciated.

Yiftach

1 Answer

You can set all the attributes manually so you don't have to fit the GaussianMixture.

You need to set weights_, means_, and covariances_ as follows:

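import numpy as np
from sklearn.mixture import GaussianMixture

d = 10  # number of features
k = 2   # number of mixture components

_weights = np.random.gamma(shape=1, scale=1, size=k)
data_gmm = GaussianMixture(n_components=k)
# Set the fitted attributes directly instead of calling fit().
data_gmm.weights_ = _weights / _weights.sum()  # must sum to 1
data_gmm.means_ = np.random.random((k, d)) * 10
# One (d, d) covariance matrix per component (covariance_type="full").
data_gmm.covariances_ = np.array([np.diag(np.random.random(d)) for _ in range(k)])
X, y = data_gmm.sample(100)  # X: samples, y: component labels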

NOTE: You might need to modify these parameter values according to your use case.
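If you start from precision matrices, as in the question, they would need to be inverted first, because sample() reads covariances_ rather than the precisions. The following is only a minimal sketch under a few assumptions of mine: the default covariance_type="full", illustrative parameter values, and a fixed random_state for reproducibility. The variable names (rng, precisions, gmm) are hypothetical and not from the question.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
d, k = 10, 2

# Supplied parameters (illustrative values, not learnt from data).
weights = rng.gamma(shape=1.0, scale=1.0, size=k)
weights /= weights.sum()  # mixture weights must sum to 1
means = rng.random((k, d)) * 10
# Diagonal precision matrices, kept away from zero so they are invertible.
precisions = np.array([np.diag(rng.random(d) + 0.5) for _ in range(k)])

gmm = GaussianMixture(n_components=k, covariance_type="full", random_state=0)
gmm.weights_ = weights
gmm.means_ = means
# Invert each (d, d) precision matrix to get the matching covariance matrix.
gmm.covariances_ = np.linalg.inv(precisions)

# sample() returns the drawn points and the component label of each point.
X, y = gmm.sample(100)
print(X.shape, y.shape)
```

This only covers sampling. Other methods such as predict() or score_samples() appear to rely on further fitted attributes (for example precisions_cholesky_), which this sketch does not set.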

Antoine Dubuis