I have implemented k-means using Spark. Because my data set is huge and the number of features is very large, I want to implement mini-batch k-means using Apache Spark MLlib. Is there an example or documentation on how to implement it?
1 Answer
The paper below does not cover Apache Spark MLlib, but it does walk through mini-batch k-means:
Sculley, David. “Web-Scale K-Means Clustering.” In Proceedings of the 19th International Conference on World Wide Web, 1177–1178. ACM, 2010. http://dl.acm.org/citation.cfm?id=1772862
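
To give a rough idea of the update rule described in that paper, here is a minimal plain-Python sketch. It is not Spark MLlib code, and the names (`mini_batch_kmeans`, `nearest_center`, `batch_size`, `iterations`, `seed`) are my own choices for illustration. It assumes the points fit in a local list; in Spark the batch could presumably be drawn with something like `RDD.sample`, but that wiring is not shown and is an assumption on my part.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(point, centers):
    """Index of the center closest to the given point."""
    return min(range(len(centers)),
               key=lambda j: squared_distance(point, centers[j]))


def mini_batch_kmeans(points, k, batch_size, iterations, seed=0):
    """Sketch of mini-batch k-means with a per-center learning rate.

    points: list of equal-length numeric tuples (assumed to fit in memory).
    Returns a list of k centers, each a list of floats.
    """
    rng = random.Random(seed)
    # Start from k distinct points picked at random from the data.
    centers = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k  # how many points each center has absorbed so far

    for _ in range(iterations):
        # Draw a small random batch (with replacement, for simplicity).
        batch = [rng.choice(points) for _ in range(batch_size)]
        # Cache the assignments before moving any center.
        assigned = [nearest_center(p, centers) for p in batch]
        for p, j in zip(batch, assigned):
            counts[j] += 1
            eta = 1.0 / counts[j]  # learning rate shrinks as the center sees more points
            # Gradient step: move the center a fraction eta toward the point.
            centers[j] = [(1 - eta) * c + eta * x
                          for c, x in zip(centers[j], p)]
    return centers
```

You would call it with a list of tuples, for example `mini_batch_kmeans(points, k=2, batch_size=20, iterations=200)`. Each iteration only touches `batch_size` points, which is where the saving over full k-means comes from when the data set is large.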

martis