Streaming Clustering with Unknown Number of Clusters

Asked Apr 29 '16 at 04:51

Active Apr 29 '16 at 04:51

Viewed 250 times

I need to cluster data points that arrive over time. Streaming K-Means would be fine if I knew in advance how many clusters to expect in the data, but I don't. Is there any way to use Spark MLlib "out of the box" to run a streaming clustering algorithm when the number of clusters is unknown?

user1478550

Do you need to experiment and change the number of clusters as data continues to arrive? If so, at what point can you "freeze" the number of clusters? If not, what guidance do you give the algorithm for cluster density and cohesion? – Prune Apr 29 '16 at 23:10

0 Answers