I have one-dimensional data with 10,000 rows.

I would like to group (cluster) these values.

I was trying to do k-means clustering, but it looks like that is not quite possible with only one variable.

I tried the clustering as follows, but it looks like I am getting only one cluster:

import matplotlib.pyplot as plt
from matplotlib import style
style.use("ggplot")

from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=5)
kmeans.fit(X)

centroid = kmeans.cluster_centers_
labels = kmeans.labels_

print(centroid)
print(labels)

Output:

[[  4450.09824891]
 [330963.43209877]
 [ 52145.48325359]
 [634299.        ]
 [146308.2320442 ]]
[0 0 0 ... 0 0 0]

Could anybody please help me get clustering to work with one variable?

Thanks in advance.

Zerone
  • A scatter plot is not clustering. A scatter plot visualizes the relationship between two continuous variables. Clustering (e.g., k-means, an unsupervised machine learning algorithm) divides your dataset into groups. You have not implemented any code that does k-means clustering. Have a search for how to do that; it is indeed possible with only one variable. – jjislam Jan 30 '23 at 19:03
  • Hi Joynul Islam, thanks, I have added the clustering code. – Zerone Jan 30 '23 at 19:34
  • The clustering will work even if you only had x1. As for making a scatter plot, you always need two variables. Plotting a scatter plot is not needed to create a cluster. – jjislam Jan 30 '23 at 19:36
  • Without providing some context on your data, it is very hard to give a suggestion – coldy Jan 30 '23 at 19:41
  • Are all of the values in `labels` zeros? –  Jan 31 '23 at 01:47
  • Can you please update your question to include how X is calculated? – Vandan Revanur Jan 31 '23 at 08:46
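
As the comments point out, k-means does work with a single variable. The snippet below is a minimal sketch of one way to check that, not the asker's actual setup: `values` is hypothetical synthetic data standing in for the real 10,000 rows, and the `n_init` and `random_state` settings are assumptions. It reshapes the 1-D values into the `(n_samples, 1)` array scikit-learn expects and counts the points per label, since `print(labels)` abbreviates long arrays with `...` and shows only the first and last few entries.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical stand-in for the real data: 10,000 skewed 1-D values.
rng = np.random.default_rng(0)
values = rng.lognormal(mean=8.0, sigma=1.5, size=10_000)

# scikit-learn expects a 2-D array of shape (n_samples, n_features),
# so a single variable becomes a column vector.
X = values.reshape(-1, 1)

kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Count how many points fall into each cluster instead of printing
# the (abbreviated) label array.
cluster_ids, counts = np.unique(labels, return_counts=True)
for cluster_id, count in zip(cluster_ids, counts):
    center = kmeans.cluster_centers_[cluster_id, 0]
    print(f"cluster {cluster_id}: {count} points, centroid {center:.1f}")
```

If the counts show more than one label in use, the clustering itself is fine and only the printed preview was misleading; with strongly skewed data it would not be surprising for most points to land in the cluster with the smallest centroid.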

0 Answers