How do I know which matrix row each cluster label corresponds to?

Question

After doing clustering I end up with an object which stores all the cluster labels, something like this:

clusterer.labels_

The above is typically a list or an array. Then I always assign the labels to the original pandas dataframe (dataset) like this:

df['cluster_labels'] = clusterer.labels_

In the end I assume that each element of clusterer.labels_ corresponds to the row at the same position in my original dataset. Is that assumption correct? For example, from the column creation above I end up with something like this:

ColA  ColB  cluster_labels
1     3     -1
2     4      2
...
89    90    45

score 1 · Answer 1 · answered Nov 07 '18 at 12:51

Broadly, yes, your assumption is right. The clustering I have used before is KMeans (https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html), where labels_ holds one label per row of the data passed to fit, in the same order, but I can't guarantee that every clusterer works like that. Assigning the labels as a new column on the dataframe will work as you expect.
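
If you want to convince yourself, here is a minimal sketch using KMeans. The toy values, the non-default index and the variable names are my own assumptions, not your data. It checks that labels_ has one entry per row and that assigning it as a column is positional.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Toy data (made up for illustration) with a non-default index,
# to show that the index labels play no role in the assignment below.
df = pd.DataFrame(
    {"ColA": [1.0, 1.2, 9.0, 9.1], "ColB": [3.0, 3.1, 20.0, 20.5]},
    index=[10, 11, 12, 13],
)
features = ["ColA", "ColB"]

clusterer = KMeans(n_clusters=2, n_init=10, random_state=0).fit(df[features])

# labels_ is a NumPy array with one entry per row passed to fit(), in the same order.
assert len(clusterer.labels_) == len(df)

# Assigning a plain array is positional: entry i goes to the i-th row of df.
df["cluster_labels"] = clusterer.labels_

# Cross-check: predicting on the same rows reproduces the stored labels row by row.
same = np.array_equal(clusterer.predict(df[features]), df["cluster_labels"].to_numpy())
print(df)
print("labels match predictions:", same)
```

One thing to watch: this only lines up if the dataframe you assign to has exactly the rows, in the same order, that you passed to fit. If you dropped or shuffled rows before fitting, assign the labels to that filtered dataframe instead. Also, if you wrap the labels in a pandas Series first, the assignment aligns on the index rather than on position.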
