
I have a dataset consisting of 2 million samples. I want to use k-means to cluster this dataset into 2000 clusters. Is it OK to use this number of clusters with this data size?

Note: the feature vector of each sample has 1000 dimensions.

coder

1 Answer


To predict the runtime of an algorithm, you can look at its time complexity. This is a formula that relates the running time to parameters of the problem; for k-means these include the number of data points and the number of clusters. Information about the time complexity of k-means clustering can be found here: Computational complexity of k-means
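As a rough illustration, the standard Lloyd's iteration for k-means computes a distance from every sample to every centroid, so one iteration costs on the order of n * k * d operations (n samples, k clusters, d features). The sketch below plugs in the numbers from the question. The function names, the iteration count, and the throughput figure are assumptions made up for this example, not measurements, so treat the result as an order-of-magnitude guess only.

```python
def lloyd_ops_per_iteration(n_samples, n_clusters, n_features):
    """Approximate operation count for one Lloyd's k-means iteration.

    Each sample is compared against each centroid, and each comparison
    touches every feature, giving roughly n * k * d operations.
    """
    return n_samples * n_clusters * n_features


def estimate_seconds(n_samples, n_clusters, n_features,
                     n_iterations, ops_per_second):
    """Very rough runtime estimate.

    n_iterations and ops_per_second are assumed values supplied by the
    caller; real throughput depends on hardware and implementation.
    """
    total_ops = lloyd_ops_per_iteration(
        n_samples, n_clusters, n_features) * n_iterations
    return total_ops / ops_per_second


# Numbers from the question: 2 million samples, 2000 clusters, 1000 features.
ops = lloyd_ops_per_iteration(2_000_000, 2000, 1000)
print(f"operations per iteration: {ops:.1e}")

# Hypothetical: 10 iterations at an assumed 1e10 operations per second.
seconds = estimate_seconds(2_000_000, 2000, 1000,
                           n_iterations=10, ops_per_second=1e10)
print(f"estimated runtime: {seconds / 60:.1f} minutes")
```

Since the cost grows linearly in each of n, k and d, halving any one of them (for example by subsampling or reducing dimensionality first) roughly halves the work per iteration.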