In H2O KMeans clustering, is there a way to calculate the actual distance from the cluster centroid for each point in the data set? Currently H2O gives the predicted cluster for the data passed in, but what is the best way of getting the distance of a point from its cluster centroid?

I intend to use this for anomaly detection, where points found far from their centroid are treated as anomalies. I have done this using Apache Spark and intend to try it using Sparkling Water, but the H2O API does not seem to show a way to get the distance of each point from its cluster centroid.

1 Answer

Unfortunately, there is currently no way to do this from R or Python. H2O has a method for this in Java, but it was never exposed in R/Python, so I have added a ticket for that here.

In the meantime, you could write custom code to do that, or you could use a Deep Learning Autoencoder for anomaly detection (example available in this tutorial).
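As a rough idea of what the custom code could look like, here is a minimal sketch in plain Python. It assumes you have already pulled the cluster centers, the predicted cluster index for each row, and the rows themselves out of H2O into ordinary lists; the function names, the toy data, and the threshold are all hypothetical. Note that if the model was trained on standardized features, the points and centers need to be on the same scale before comparing them.

```python
import math

def distances_to_assigned_centroid(points, centroids, labels):
    """Euclidean distance from each point to the centroid of its assigned cluster.

    points    : list of numeric feature vectors
    centroids : list of centroid vectors, indexed by cluster id
    labels    : predicted cluster id for each point (same order as points)
    """
    return [math.dist(p, centroids[k]) for p, k in zip(points, labels)]

def flag_anomalies(distances, threshold):
    """Mark points whose distance exceeds a user-chosen threshold (an assumption)."""
    return [d > threshold for d in distances]

# Toy data standing in for values exported from a fitted model.
centroids = [[0.0, 0.0], [10.0, 10.0]]
points = [[3.0, 4.0], [10.0, 11.0], [0.0, 0.0]]
labels = [0, 1, 0]

distances = distances_to_assigned_centroid(points, centroids, labels)
flags = flag_anomalies(distances, threshold=2.0)
```

Here the first point sits 5 units from its centroid and would be flagged, while the other two would not. In practice the threshold might be picked per cluster, for example from a high percentile of the distances within that cluster.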

Erin LeDell