I am using Streaming k-means to cluster a 2-dimensional data stream, following the example at

http://spark.apache.org/docs/latest/mllib-clustering.html#streaming-k-means.

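Code:

from pyspark.mllib.clustering import StreamingKMeans

# k=5 clusters; random 2-dimensional initial centers (weight 1.0, seed 0)
model = StreamingKMeans(k=5, decayFactor=0.7).setRandomCenters(2, 1.0, 0)
model.trainOn(trainingData)
clust = model.predictOnValues(testData.map(lambda lp: (lp.label, lp.features)))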

It runs without errors. Now I need to find and print the cluster centers for each batch, or over each sliding window. Given that the centroids are updated with a decayFactor of 0.7, how can I find or calculate the cluster centers?
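For reference, here is a minimal pure-Python sketch of the per-batch center update as I understand it from the formula in the linked documentation: c_new = (c * n * a + x * m) / (n * a + m), where c is the previous center, n its weight, a the decay factor, x the mean of the batch points assigned to the cluster, and m their count. The function name `update_center` and the sample numbers are my own and hypothetical; this is not Spark's implementation.

```python
def update_center(center, weight, batch_points, decay):
    """Return the new center after one batch, per the documented update rule.

    center       -- previous center c_t, as a sequence of floats
    weight       -- previous weight n_t (points assigned so far)
    batch_points -- points assigned to this cluster in the current batch
    decay        -- decay factor a (0.7 in the question)
    """
    m = len(batch_points)
    if m == 0:
        # No new points for this cluster: the center is unchanged.
        return list(center)
    dim = len(center)
    # x_t: mean of the points assigned to this cluster in the batch
    batch_mean = [sum(p[i] for p in batch_points) / m for i in range(dim)]
    denom = weight * decay + m
    return [(center[i] * weight * decay + batch_mean[i] * m) / denom
            for i in range(dim)]


# Hypothetical numbers: old center at the origin with weight 10,
# two new points whose mean is (2, 2), decay factor 0.7.
new_center = update_center([0.0, 0.0], 10.0, [(1.0, 1.0), (3.0, 3.0)], 0.7)
print(new_center)
```

With a decay factor below 1, the old center counts for less in each new batch, so recent data moves the center more. What I am still missing is how to read the current centers out of the running model in each batch, rather than recomputing them by hand like this.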

Saeed

0 Answers