Recall Precision Curve for clustering algorithms

Question

I would like to know whether the precision-recall curve is relevant for clustering algorithms, for example unsupervised learning techniques such as Mean Shift or DBSCAN, or whether it is relevant only for classification algorithms. If it is, how do I get the plot points for low recall values? Is it allowed to change the model parameters to get low recall rates for a model?

score 0 · Answer 1 · answered May 29 '17 at 06:53

PR curves (and ROC curves) require a ranking.

E.g. a classifier score that can be used to rank objects by how likely they are to belong to class A.
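To illustrate how a ranking turns into curve points, here is a minimal sketch in plain Python. The function name `pr_points` and the toy scores and labels are made up for illustration, and ties between scores are not handled specially. Each cutoff position in the ranking yields one (recall, precision) pair, so the low-recall points come from the top of the ranking rather than from changing model parameters.

```python
def pr_points(scores, labels):
    """Return one (recall, precision) pair per cutoff position in the ranking."""
    # Rank objects from most to least likely to belong to class A.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(labels)  # assumes at least one positive label
    points = []
    true_pos = 0
    for k, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        # Recall: share of all positives found in the top k.
        # Precision: share of the top k that are positives.
        points.append((true_pos / total_pos, true_pos / k))
    return points


# Hypothetical scores and ground-truth labels (1 = class A).
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
labels = [1, 0, 1, 1, 0]
points = pr_points(scores, labels)
```

If scikit-learn is available, `sklearn.metrics.precision_recall_curve` computes the same kind of points from scores and labels.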

In clustering, you usually do not have such a ranking.

Without a ranking, you don't get a curve. Also, what would precision and recall even mean in clustering? Use ARI (adjusted Rand index) and NMI (normalized mutual information) for evaluation instead.
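As a rough sketch of what ARI measures, the following plain-Python function compares a clustering against reference labels using the standard pair-counting formula. The name `adjusted_rand_index` and the toy label lists are illustrative assumptions, and the sketch assumes at least two objects. Note that the score does not depend on how the clusters are numbered.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index of two labelings of the same objects (n >= 2)."""
    n = len(labels_true)
    # Pairs of objects that share a cluster in both labelings.
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    # Pairs that share a cluster in each labeling on its own.
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are a single cluster
    return (sum_cells - expected) / (max_index - expected)


# Same partition with swapped cluster ids, and a partition that disagrees.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
different = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

scikit-learn offers ready-made versions as `sklearn.metrics.adjusted_rand_score` and `sklearn.metrics.normalized_mutual_info_score`.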

But there are unsupervised methods such as outlier detection where, e.g., the ROC curve is a fairly common evaluation method. The PR curve is more problematic, because precision is not defined at recall 0, and you shouldn't linearly interpolate between points. Thus, the popular "area under curve" is not well defined for PR curves. Since there are a dozen other measures, I'd avoid PR-AUC because of this.
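For the outlier detection case, a minimal sketch of the area under the ROC curve can be written directly from the ranking: it equals the probability that a randomly chosen outlier gets a higher score than a randomly chosen inlier, with ties counted as one half. The function name `roc_auc` and the example scores are hypothetical, and the sketch assumes both outliers and inliers are present.

```python
def roc_auc(scores, is_outlier):
    """ROC AUC as the fraction of (outlier, inlier) pairs ranked correctly."""
    outliers = [s for s, y in zip(scores, is_outlier) if y]
    inliers = [s for s, y in zip(scores, is_outlier) if not y]
    # Count pairs where the outlier scores higher; a tie counts as half.
    wins = sum((o > i) + 0.5 * (o == i) for o in outliers for i in inliers)
    return wins / (len(outliers) * len(inliers))


# Hypothetical outlier scores (higher = more anomalous) and ground truth.
perfect = roc_auc([0.1, 0.2, 0.9, 0.3, 0.8], [0, 0, 1, 0, 1])
with_tie = roc_auc([0.9, 0.1, 0.5, 0.5], [1, 0, 1, 0])
```

With scikit-learn, `sklearn.metrics.roc_auc_score` gives the same quantity from labels and scores.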
