I have 1D data (one column of data). I used a Gaussian Mixture Model (GMM) for density estimation, using this Python implementation: https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html. Relying on the AIC/BIC criteria, I was able to determine the number of components. After fitting the GMM, I plotted the kernel density estimate of the original observations together with that of sampled data drawn from the GMM. The original and sampled densities look quite similar (which is good). But I would like some metrics to report how good the fitted model is.
from sklearn.mixture import GaussianMixture
g = GaussianMixture(n_components=35)
data = df['x'].values.reshape(-1, 1)  # data taken from a data frame (10,000 data points)
clf = g.fit(data)  # fit the model
samples = clf.sample(10000)[0]  # draw sampled data points (same number as the original data)
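For concreteness, the AIC/BIC selection step mentioned above can be sketched as follows. This is only a minimal sketch under stated assumptions: the synthetic two-Gaussian array stands in for `df['x']`, and the candidate range 1 to 6 and the names `candidates`, `bics`, and `best_k` are chosen for illustration. It uses the `bic` method of `GaussianMixture`; `aic` could be used the same way.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-in for df['x'] (an assumption): two well-separated Gaussians.
x = np.concatenate([rng.normal(-3.0, 1.0, 1000), rng.normal(3.0, 0.5, 1000)])
data = x.reshape(-1, 1)  # scikit-learn expects shape (n_samples, n_features)

candidates = range(1, 7)  # hypothetical range of component counts to try
bics = []
for k in candidates:
    gmm = GaussianMixture(n_components=k, random_state=0).fit(data)
    bics.append(gmm.bic(data))  # lower BIC is better

# Pick the component count with the smallest BIC.
best_k = list(candidates)[int(np.argmin(bics))]
print("BIC per k:", bics)
print("selected n_components:", best_k)
```

Lower values are better for both criteria, so the selected count is the one that minimizes the chosen criterion over the candidates tried.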
I found score in the implementation, but I am not sure how to use it. Am I doing it wrong? Or is there a better way to show how accurate the fitted model is, apart from histograms or kernel density plots?
print(clf.score(data))
print(clf.score(samples))
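As far as the scikit-learn documentation goes, `score` returns the average per-sample log-likelihood of the data passed to it, so the two numbers above are mean log-likelihoods rather than accuracies. One possible way to turn this into a reportable quantity is to compare the score on the training data with the score on held-out data. The following is a minimal sketch assuming synthetic data in place of `df['x']`, a 70/30 split, and two components; those choices are illustrative and not taken from the setup above.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in for df['x'] (an assumption): two well-separated Gaussians.
x = np.concatenate([rng.normal(-3.0, 1.0, 1000), rng.normal(3.0, 0.5, 1000)])
data = x.reshape(-1, 1)

# Hold out 30% of the points that the model never sees during fitting.
train, test = train_test_split(data, test_size=0.3, random_state=0)
gmm = GaussianMixture(n_components=2, random_state=0).fit(train)

train_ll = gmm.score(train)  # mean log-likelihood per training point
test_ll = gmm.score(test)    # mean log-likelihood per held-out point

# score() is the mean of the per-point log-densities from score_samples().
manual_test_ll = gmm.score_samples(test).mean()

print("train mean log-likelihood:", train_ll)
print("test mean log-likelihood: ", test_ll)
```

A held-out score that is much lower than the training score would suggest overfitting (for example, too many components), while similar values suggest the fitted density generalizes to unseen points from the same distribution.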