
We used 100,000 kits. We are running Spark 1.6.1 with Scala 2.1.0. How can I fix the memory errors and get good results?


Syfer

1 Answer


The various DBSCAN addons for Spark are all problematic.

See this report:

Neukirchen, Helmut. "Survey and Performance Evaluation of DBSCAN Spatial Clustering Implementations for Big Data and High-Performance Computing Paradigms." (2016).

From a JVM language such as Scala, it should be easy to call a library such as ELKI directly and get quite good performance.
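To make the single-JVM route concrete, here is a minimal sketch of what a DBSCAN call computes, in plain Java so it can be called from Scala as-is. This is not ELKI's API: `NaiveDbscan`, `cluster`, and the `NOISE` label are hypothetical names chosen for illustration, and the neighbor search is a naive linear scan with Euclidean distance, which I am assuming here. ELKI can use spatial indexes instead of the linear scan, so prefer it for real data.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration class; not part of ELKI or Spark.
final class NaiveDbscan {
    static final int NOISE = -1;
    private static final int UNVISITED = -2;

    /** Returns one label per point: a cluster id >= 0, or NOISE. */
    static int[] cluster(double[][] points, double eps, int minPts) {
        int[] labels = new int[points.length];
        Arrays.fill(labels, UNVISITED);
        int clusterId = 0;
        for (int i = 0; i < points.length; i++) {
            if (labels[i] != UNVISITED) continue;
            List<Integer> neighbors = regionQuery(points, i, eps);
            if (neighbors.size() < minPts) {
                labels[i] = NOISE; // may be relabelled later as a border point
                continue;
            }
            // Point i is a core point: start a new cluster and expand it.
            labels[i] = clusterId;
            Deque<Integer> queue = new ArrayDeque<>(neighbors);
            while (!queue.isEmpty()) {
                int j = queue.poll();
                if (labels[j] == NOISE) labels[j] = clusterId; // border point
                if (labels[j] != UNVISITED) continue;
                labels[j] = clusterId;
                List<Integer> jNeighbors = regionQuery(points, j, eps);
                if (jNeighbors.size() >= minPts) queue.addAll(jNeighbors); // j is also core
            }
            clusterId++;
        }
        return labels;
    }

    // Linear scan: indices of all points within eps of points[i], including i itself.
    private static List<Integer> regionQuery(double[][] points, int i, double eps) {
        List<Integer> result = new ArrayList<>();
        for (int j = 0; j < points.length; j++) {
            if (distance(points[i], points[j]) <= eps) result.add(j);
        }
        return result;
    }

    // Euclidean distance (an assumption; the question does not name a metric).
    private static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

From Scala this would be called as `NaiveDbscan.cluster(points, eps, minPts)` with `points: Array[Array[Double]]`. The labels array needs only one `int` per point, but the linear scan makes the whole run quadratic in the number of points, which is the part an indexed implementation avoids.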

Has QUIT--Anony-Mousse