
I'm looking for the exact value of epsilon to run the DBSCAN clustering algorithm.

Here's the KNN distance plot.

KNN distplot

This chart has two inflection points, and I need the second one. I'm using the following code:

# evaluate kNN distance on the numeric columns only (Species is a factor)
dist <- dbscan::kNNdist(iris[, 1:4], k = 4)

# order result
dist <- dist[order(dist)]

# scale
dist <- dist / max(dist)

# derivative
ddist <- diff(dist) / ( 1 / length(dist))

# take the point just before the trailing run where the derivative exceeds 1
# (assumes every such point lies at the end of the sorted curve)
knee <- dist[length(ddist) - length(ddist[ddist > 1])]

How can I improve my code to get the second point where the derivative is higher than 1?

Has QUIT--Anony-Mousse

2 Answers

0

Sorry, I can't reproduce your code (perhaps because of a package version conflict), but the which function should help you find all positions where ddist > 1: which(ddist > 1)
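As a minimal sketch of that idea, the snippet below uses a small made-up vector in place of the real `ddist` and a hypothetical helper, `crossing_starts`, that keeps only the positions where the derivative first rises above the threshold. Both the toy data and the helper name are assumptions for illustration, not part of the question's code.

```r
# Hypothetical helper: start position of each run where ddist > threshold
crossing_starts <- function(ddist, threshold = 1) {
  above <- ddist > threshold
  # a run starts where `above` is TRUE and the previous element is FALSE
  which(above & !c(FALSE, head(above, -1)))
}

# Toy derivative values standing in for `ddist` from the question
ddist <- c(0.2, 0.3, 1.5, 1.2, 0.4, 0.6, 2.0, 3.0)

all_above <- which(ddist > 1)      # every position above the threshold
starts <- crossing_starts(ddist)   # first position of each run
second <- starts[2]                # NA if there is only one crossing
```

Since `ddist[i]` is the slope between `dist[i]` and `dist[i + 1]`, `dist[second]` would be the candidate value on the original curve, provided a second crossing exists.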

0

Because the x and y axes are not comparable, this approach doesn't work. It's too sensitive to scale.

Finding the elbow/knee/inflexion is subjective and cannot reliably be automated.
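The scale sensitivity can be illustrated with a small sketch. The helper `knee_index` below is hypothetical: it repeats the question's normalisation (divide by the maximum, step size 1 / n) and returns the first position where the slope exceeds 1. The distances are made-up numbers, not output from kNNdist.

```r
# Hypothetical helper reproducing the question's scaling and threshold
knee_index <- function(d) {
  d <- sort(d) / max(d)          # scale y to [0, 1]
  slope <- diff(d) * length(d)   # x step is 1 / length(d)
  which(slope > 1)[1]            # first position with slope > 1
}

# Toy sorted distances (assumed values for illustration)
d <- c(0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.30, 0.50, 0.80)

k_plain   <- knee_index(d)         # without an outlier
k_outlier <- knee_index(c(d, 8))   # one extreme distance added
```

With these numbers the detected position moves from 7 to 10 once the single outlier is added, because dividing by a larger maximum flattens every other slope below the threshold.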

Has QUIT--Anony-Mousse