We used 100,000 kits. The Spark version is 1.6.1 and the Scala version is 2.1.0. How can I fix the memory errors and get good results?
The various DBSCAN add-ons for Spark are all problematic. See this survey:

Neukirchen, Helmut. "Survey and Performance Evaluation of DBSCAN Spatial Clustering Implementations for Big Data and High-Performance Computing Paradigms." (2016).

From a JVM language like Scala, it should be easy to call, e.g., ELKI directly and get quite good performance.
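For reference, this is what DBSCAN computes on a single machine. The sketch below is a minimal, self-contained Java implementation for illustration only, not ELKI's API (ELKI's index-accelerated version is what you would actually call); the class and helper names here are hypothetical:

```java
import java.util.*;

// Minimal single-machine DBSCAN on 2-D points with Euclidean distance.
// Labels: 0 = noise, cluster ids start at 1. Illustration only, not ELKI.
public class MiniDbscan {
    static final int UNVISITED = -1, NOISE = 0;

    static int[] dbscan(double[][] pts, double eps, int minPts) {
        int[] label = new int[pts.length];
        Arrays.fill(label, UNVISITED);
        int cluster = 0;
        for (int i = 0; i < pts.length; i++) {
            if (label[i] != UNVISITED) continue;
            List<Integer> neighbors = regionQuery(pts, i, eps);
            if (neighbors.size() < minPts) { label[i] = NOISE; continue; }
            cluster++;                       // i is a core point: start a cluster
            label[i] = cluster;
            ArrayDeque<Integer> seeds = new ArrayDeque<>(neighbors);
            while (!seeds.isEmpty()) {
                int q = seeds.poll();
                if (label[q] == NOISE) label[q] = cluster;   // border point
                if (label[q] != UNVISITED) continue;
                label[q] = cluster;
                List<Integer> qn = regionQuery(pts, q, eps);
                if (qn.size() >= minPts) seeds.addAll(qn);   // core point: expand
            }
        }
        return label;
    }

    // Linear scan; a real implementation uses a spatial index (as ELKI does).
    static List<Integer> regionQuery(double[][] pts, int i, double eps) {
        List<Integer> out = new ArrayList<>();
        for (int j = 0; j < pts.length; j++) {
            double dx = pts[i][0] - pts[j][0], dy = pts[i][1] - pts[j][1];
            if (Math.sqrt(dx * dx + dy * dy) <= eps) out.add(j);
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] pts = {
            {0, 0}, {0, 1}, {1, 0},        // cluster 1
            {10, 10}, {10, 11}, {11, 10},  // cluster 2
            {50, 50}                       // noise
        };
        // prints [1, 1, 1, 2, 2, 2, 0]
        System.out.println(Arrays.toString(dbscan(pts, 2.0, 3)));
    }
}
```

The naive `regionQuery` is O(n) per point, O(n²) overall, which is exactly why an indexed implementation such as ELKI's matters at 100,000 points.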

Has QUIT--Anony-Mousse