I have fitted a KMeans model and retrieved the centroids for the data.

Is there any way to initialise a KMeans model with these centroids and call predict() without calling fit()?

I tried the following code and ran into the error below. Each line of the jsonl file holds one JSON object, like this:

{ "primary" : [[<some_array>]]}
{ "secondary" : [[<some_array>]]}
import json
import numpy as np
from sklearn.cluster import KMeans

# json_list holds the lines of the jsonl file
models = dict()
for json_str in json_list:
    result = json.loads(json_str)
    models[list(result.keys())[0]] = list(result.values())[0]

k = KMeans(init=np.array(models['primary']))
k.predict(inference_data)
NotFittedError: This KMeans instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

This problem is handled well in the cuML version of KMeans, but how can I get it done with sklearn?

Aman Rai

1 Answer

You can serialize the fitted object with either pickle or joblib. Sklearn's preferred way is to use joblib.

from joblib import dump, load
from sklearn.cluster import KMeans

k = KMeans()

# fit the model here

dump(k, 'filename.joblib')

# later on, load the fitted model and call predict() directly

k = load('filename.joblib')

As for setting values on the object as you're asking, you could try setting all of the attributes that KMeans defines after fitting. You'd have to save all of that data yourself, though, so using pickle or joblib is easier.

Pseudocode below. Everything to the right of an equals sign would have to be saved.

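# Create the instance without running __init__ or fit
k = KMeans.__new__(KMeans)

k.cluster_centers_ = best_centers
k._n_features_out = k.cluster_centers_.shape[0]
k.labels_ = best_labels
k.inertia_ = best_inertia
k.n_iter_ = best_n_iter
# Private attributes vary between scikit-learn versions; predict() may need
# others as well (for example k._n_threads), so check your installed version.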
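
If you only need predictions, another option is to skip the KMeans object entirely: predict() on a fitted KMeans assigns each sample to its nearest centroid, and that can be computed directly from the saved centroids. Below is a minimal sketch with NumPy only. The function name predict_from_centroids and the sample arrays are made up for illustration, and it assumes the centroids have shape (n_clusters, n_features).

```python
import numpy as np


def predict_from_centroids(X, centers):
    """Return, for each row of X, the index of the nearest centroid (Euclidean)."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Pairwise differences via broadcasting: (n_samples, n_clusters, n_features)
    diff = X[:, np.newaxis, :] - centers[np.newaxis, :, :]
    # Squared Euclidean distances: (n_samples, n_clusters)
    sq_dist = np.sum(diff ** 2, axis=2)
    return np.argmin(sq_dist, axis=1)


# Hypothetical centroids and data, for illustration only
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
inference_data = np.array([[1.0, 1.0], [9.0, 8.0], [-2.0, 0.5]])
labels = predict_from_centroids(inference_data, centers)
```

The broadcast builds an intermediate array of size n_samples x n_clusters x n_features, so for large inputs you may want to process the data in chunks. sklearn.metrics.pairwise_distances_argmin(inference_data, centers) should give the same assignment if you'd rather stay within sklearn.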
K. Shores