sklearn - sample from GaussianMixture without fitting

Question

I would like to use a GaussianMixture for generating random data. The parameters should not be learnt from data but supplied.

GaussianMixture allows supplying initial values for weights, means and precisions, but calling "sample" without fitting first is still not possible.

Example:

import numpy as np
from sklearn.mixture import GaussianMixture
d = 10
k = 2
_weights = np.random.gamma(shape=1, scale=1, size=k)
data_gmm = GaussianMixture(n_components=k, 
                           weights_init=_weights / _weights.sum(),
                           means_init=np.random.random((k, d)) * 10,
                           precisions_init=[np.diag(np.random.random(d)) for _ in range(k)])
data_gmm.sample(100)

This throws:

NotFittedError: This GaussianMixture instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

I've tried:

Calling _initialize_parameters() - this requires also supplying a data matrix, and does not initialize a covariances variable needed for sampling.
Calling set_params() - this does not allow supplying values for the attributes used by sampling.

Any help would be appreciated.

score 1 · Answer 1 · answered Jul 27 '21 at 11:11

You can set the fitted attributes manually, so you don't have to fit the GaussianMixture.

You need to set weights_, means_ and covariances_ as follows:

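import numpy as np
from sklearn.mixture import GaussianMixture
d = 10  # number of features
k = 2   # number of mixture components
_weights = np.random.gamma(shape=1, scale=1, size=k)
data_gmm = GaussianMixture(n_components=k)
# Set the fitted attributes (trailing underscore) that sample() reads.
data_gmm.weights_ = _weights / _weights.sum()
data_gmm.means_ = np.random.random((k, d)) * 10
data_gmm.covariances_ = np.array([np.diag(np.random.random(d)) for _ in range(k)])
# sample() returns the samples and their component labels.
X, y = data_gmm.sample(100)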

NOTE: You might need to modify these parameter values according to your use case.
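If you do this in more than one place, it may help to wrap the attribute assignment in a small function. The following is only a sketch: make_unfitted_gmm is a hypothetical helper name, not part of scikit-learn, and it assumes covariance_type="full". It also fills in precisions_cholesky_, on the assumption that you may later want methods other than sample(); that part is not needed for sampling itself.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def make_unfitted_gmm(weights, means, covariances, random_state=None):
    """Hypothetical helper: build a GaussianMixture usable with sample() without fit()."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covariances = np.asarray(covariances, dtype=float)

    gmm = GaussianMixture(
        n_components=len(weights),
        covariance_type="full",
        random_state=random_state,
    )
    # Fitted attributes that sample() reads.
    gmm.weights_ = weights / weights.sum()
    gmm.means_ = means
    gmm.covariances_ = covariances
    # Assumption: other methods may want the Cholesky factor P of each
    # precision matrix, with P @ P.T == inv(covariance).
    gmm.precisions_cholesky_ = np.array(
        [np.linalg.inv(np.linalg.cholesky(c)).T for c in covariances]
    )
    return gmm


rng = np.random.default_rng(0)
k, d = 2, 10
gmm = make_unfitted_gmm(
    weights=rng.gamma(shape=1.0, scale=1.0, size=k),
    means=rng.random((k, d)) * 10,
    # Add a small constant so every variance is strictly positive.
    covariances=[np.diag(rng.random(d) + 0.1) for _ in range(k)],
    random_state=0,
)
X, labels = gmm.sample(100)
print(X.shape, labels.shape)
```

Passing random_state to the constructor makes the draws reproducible, since sample() takes its random generator from that parameter.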

Antoine Dubuis

Is this the most elegant way? I would think sklearn does not require me to know which internal variables should be overridden. – Yiftach Jul 27 '21 at 13:02
Well, it is the only way to use scikit-learn's `GaussianMixture` without having to fit it. – Antoine Dubuis Jul 27 '21 at 13:41
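If relying on scikit-learn's fitted attributes feels too fragile, one alternative is to skip GaussianMixture and sample with NumPy directly: draw a component index for each sample from the weights, then draw from that component's Gaussian. This is a minimal sketch; sample_gaussian_mixture is a hypothetical function name, and it assumes full covariance matrices.

```python
import numpy as np


def sample_gaussian_mixture(weights, means, covariances, n_samples, seed=None):
    """Hypothetical helper: draw samples from a Gaussian mixture with known parameters."""
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # make sure the weights sum to 1
    means = np.asarray(means, dtype=float)

    # Pick a component index for every sample.
    labels = rng.choice(len(weights), size=n_samples, p=weights)

    X = np.empty((n_samples, means.shape[1]))
    for j, (mean, cov) in enumerate(zip(means, covariances)):
        mask = labels == j
        # Draw as many points from component j as were assigned to it.
        X[mask] = rng.multivariate_normal(mean, cov, size=int(mask.sum()))
    return X, labels


k, d = 2, 10
param_rng = np.random.default_rng(1)
X, labels = sample_gaussian_mixture(
    weights=param_rng.gamma(shape=1.0, scale=1.0, size=k),
    means=param_rng.random((k, d)) * 10,
    covariances=[np.diag(param_rng.random(d) + 0.1) for _ in range(k)],
    n_samples=100,
    seed=0,
)
print(X.shape, labels.shape)
```

Unlike GaussianMixture.sample(), which returns the samples grouped by component, this version leaves them in random order.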
