I have been using k-Means to cluster data into 2 classes. Now I would like to try a different approach and use a Gaussian Mixture Model (GMM) to cluster the data into 2 classes. I have gone through the Scikit-Learn documentation and other SO questions, but I am unable to understand how to use a GMM for 2-class clustering in my present context.
I can easily cluster the data into 2 classes using k-Means as follows:
import pandas as pd
from scipy import stats
from sklearn.cluster import KMeans
import numpy as np
df = pd.read_pickle('my_df.pkl')
clmns = df.columns
df = df.fillna(df.mean())
df.isnull().any()  # check that no NaNs remain after filling
df_tr_std = stats.zscore(df[clmns])
kmeans = KMeans(n_clusters = 2, random_state = 0, n_init = 100, max_iter=500, n_jobs = -1).fit(df_tr_std)
# >>> kmeans
# KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300,
# n_clusters=2, n_init=10, n_jobs=None, precompute_distances='auto',
# random_state=0, tol=0.0001, verbose=0)
labels = kmeans.labels_
I would appreciate any one-liner or short code segment that I can use to fit a GMM on my data (df_tr_std). I am sure that fitting a GMM must be a fairly simple process, but I am very confused as to how my current k-Means code can be adapted to a GMM.
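For reference, here is a minimal sketch of what the GMM counterpart of the k-Means call might look like, assuming scikit-learn's `GaussianMixture` class from `sklearn.mixture`. The synthetic array `X_std` is only a stand-in for `df_tr_std` (the real data in `my_df.pkl` is not available here), and the `n_init` and `max_iter` values are illustrative assumptions, not tuned settings.

```python
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for df_tr_std: two well-separated blobs (assumption,
# used only so the example runs without my_df.pkl).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(100, 3)),
    rng.normal(loc=5.0, scale=1.0, size=(100, 3)),
])
X_std = stats.zscore(X)  # same standardisation step as in the k-Means code

# n_components plays the role of n_clusters in KMeans.
gmm = GaussianMixture(n_components=2, random_state=0,
                      n_init=10, max_iter=500).fit(X_std)

# GaussianMixture has no labels_ attribute; predict() gives hard assignments.
labels = gmm.predict(X_std)
# predict_proba() gives each point's membership probability per component.
probs = gmm.predict_proba(X_std)
```

With the real data, `X_std` would simply be replaced by `df_tr_std`. Unlike k-Means, the fitted model also exposes soft assignments through `predict_proba`, which may be useful when points lie between the two clusters.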