
I have implemented k-means using Spark. Because my data set is huge and has a very large number of features, I want to implement mini-batch k-means using Apache Spark MLlib. Is there an example or documentation showing how to implement it?

Rahul

1 Answer


The paper below does not cover Apache Spark MLlib, but it does walk through the mini-batch k-means algorithm:

Sculley, David. “Web-Scale K-Means Clustering.” In Proceedings of the 19th International Conference on World Wide Web, 1177–1178. ACM, 2010. http://dl.acm.org/citation.cfm?id=1772862
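For orientation, here is a minimal single-machine sketch of the mini-batch update described in that paper, in plain Python rather than Spark. The function names (`nearest`, `mini_batch_kmeans`) and parameters (`batch_size`, `iterations`, `seed`) are my own, not from the paper or from MLlib, so treat it as an illustration of the idea rather than a reference implementation.

```python
import random


def nearest(centers, x):
    """Return the index of the center closest to x (squared Euclidean distance)."""
    best, best_dist = 0, float("inf")
    for j, c in enumerate(centers):
        dist = sum((ci - xi) ** 2 for ci, xi in zip(c, x))
        if dist < best_dist:
            best, best_dist = j, dist
    return best


def mini_batch_kmeans(data, k, batch_size, iterations, seed=0):
    """Sketch of mini-batch k-means; `data` is a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Initialize each center with a randomly picked data point.
    centers = [list(x) for x in rng.sample(data, k)]
    counts = [0] * k  # per-center count of points assigned so far

    for _ in range(iterations):
        # Draw a small random batch instead of scanning the whole data set.
        batch = [rng.choice(data) for _ in range(batch_size)]
        # Cache the nearest center for every batch point before moving any center.
        assigned = [nearest(centers, x) for x in batch]
        for x, j in zip(batch, assigned):
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-center learning rate shrinks over time
            # Move the center a step of size eta toward the point.
            centers[j] = [(1 - eta) * c + eta * xi for c, xi in zip(centers[j], x)]
    return centers
```

Each iteration touches only `batch_size` points, which is what keeps the cost low on large data. On Spark, one possible route is to draw each batch with `RDD.sample` and apply the same update on the driver; that mapping is my assumption and is not something the paper covers.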

martis