How to calculate multi class classification AUC with labels?

Question

I am using pROC (in R) with the function multiclass.roc, as suggested in the thread How to plot ROC curves in multiclass classification?

However, when I applied it to my data, I got an error:

predictor must be numeric or ordered

Obviously my data labels are unordered. In that case, how can I calculate the AUC?

P/S: The idea is, I have a confusion matrix as the result of a multi-class classifier. How can I calculate AUC for this confusion matrix in R?

Update1:

Let's say I have 4 classes A, B, C, D with no ordering (i.e., neither A > B nor B > A holds).

The correct values:

A A A B B C D A B C D A B C ...

The predicting values:

A B A B B B C D ...

How should I calculate AUC for this data?

Update 2

The code to generate the sample data:

x <- c(rep("A", 50), rep("B", 50), rep("C", 50), rep("D", 50))
x <- as.factor(x)
x_true <- sample(x)
x_predict <- sample(x)

Then I tried

library (pROC)
multiclass.roc(x_true, x_predict)
Error in roc.default(response, predictor, levels = X, percent = percent,  : 
  Predictor must be numeric or ordered.

Please provide some sample data and code. – Feb 03 '16 at 08:21

RHertel · Answer 1 · 2016-02-03T11:06:08.650

3

No matter how many classes you are trying to label, a confusion matrix will never be sufficient to calculate the AUC. A confusion matrix results from one particular choice of classifier parameters (for example, a decision threshold), and that choice fixes a single sensitivity/specificity pair. It therefore represents only one point on the ROC curve, while the ROC curve describes the whole range of such choices and so contains much more information. The AUC is the integral of the ROC curve, and I don't see how this integral could be computed without the ROC.
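To illustrate the point, here is a minimal sketch in base R that derives the one-vs-rest true positive rate and false positive rate for each class from a confusion matrix. It reuses the sample data from the question; the `set.seed` call and the variable names `cm` and `rates` are my own additions. Each class ends up with exactly one (FPR, TPR) pair, which is one point in ROC space and not a curve.

```r
# Sample labels as in the question (seed added so the run is reproducible).
set.seed(1)
x <- factor(c(rep("A", 50), rep("B", 50), rep("C", 50), rep("D", 50)))
x_true <- sample(x)
x_predict <- sample(x)

# Confusion matrix: rows are true classes, columns are predicted classes.
cm <- table(truth = x_true, predicted = x_predict)

# One-vs-rest counts for each class, derived from the matrix alone.
tp <- diag(cm)
fn <- rowSums(cm) - tp
fp <- colSums(cm) - tp
tn <- sum(cm) - tp - fn - fp

rates <- data.frame(
  class = rownames(cm),
  tpr = as.numeric(tp / (tp + fn)),  # sensitivity
  fpr = as.numeric(fp / (fp + tn))   # 1 - specificity
)
print(rates)
```

Because the labels here are shuffled at random, the rates should hover around 0.25 for every class, but the exact values depend on the seed.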

edited Feb 03 '16 at 11:06

answered Feb 03 '16 at 09:01


Hi, so let's say I use randomForest as the classifier, and the result is the probability of each class, not only the prediction. Is that enough to calculate the AUC? – mamatv Feb 03 '16 at 09:02
I think you could calculate the AUC by using the Gini index if you knew everything about the classification tree, meaning the probabilities for each internal node split. I believe that the final probabilities of the leaves won't be sufficient, but I'll be happy to learn more and remove this answer if somebody convinces me otherwise. – RHertel Feb 03 '16 at 10:59

score -1 · Answer 2 · answered Jun 21 '17 at 09:41

Please note that there is a way to approximate the AUC from only one point of the curve. It consists of connecting that point to the points (0,0) and (1,1) with straight lines.

If you do this, the resulting AUC is

AUC = (1 + TP - FP)/2

where TP is the True Positive Rate and FP is the False Positive Rate (you can check this with basic geometry).
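The geometry can be checked numerically. The sketch below is in base R; the function name `single_point_auc` and the example rates 0.8 and 0.3 are hypothetical values chosen for illustration. It integrates the piecewise-linear curve through (0,0), (FP, TP) and (1,1) with the trapezoidal rule and compares the result with the closed form.

```r
# Area under the piecewise-linear curve (0,0) -> (fpr,tpr) -> (1,1),
# computed with the trapezoidal rule.
single_point_auc <- function(tpr, fpr) {
  xs <- c(0, fpr, 1)
  ys <- c(0, tpr, 1)
  sum(diff(xs) * (head(ys, -1) + tail(ys, -1)) / 2)
}

tpr <- 0.8  # hypothetical true positive rate
fpr <- 0.3  # hypothetical false positive rate

area <- single_point_auc(tpr, fpr)
closed_form <- (1 + tpr - fpr) / 2

print(area)
print(all.equal(area, closed_form))
```

The two triangles and trapezoids add up to 0.3 * 0.8 / 2 + 0.7 * 1.8 / 2 = 0.75, which matches (1 + 0.8 - 0.3) / 2.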

Of course, how to compute multi-class AUC is a different matter.
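For the multi-class case, one common option (not the only one) is to compute a one-vs-rest AUC per class and average them. This needs a score per class for every observation, for example class probabilities, not just predicted labels. The sketch below uses base R only; the helper `ovr_auc` and the simulated `scores` matrix are hypothetical and stand in for whatever a real classifier would return.

```r
# AUC for one class against the rest, via the Mann-Whitney rank statistic:
# the probability that a random positive outranks a random negative.
ovr_auc <- function(is_positive, score) {
  r <- rank(score)  # average ranks, so ties count as one half
  n_pos <- sum(is_positive)
  n_neg <- sum(!is_positive)
  (sum(r[is_positive]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical data: true labels plus a score matrix with one column per class.
set.seed(1)
classes <- c("A", "B", "C", "D")
y <- factor(sample(rep(classes, each = 50)), levels = classes)
scores <- matrix(runif(length(y) * 4), ncol = 4,
                 dimnames = list(NULL, classes))

# Make the scores mildly informative by boosting the true-class column.
idx <- cbind(seq_along(y), as.integer(y))
scores[idx] <- scores[idx] + 0.5
scores <- scores / rowSums(scores)  # rows sum to 1, like class probabilities

# One AUC per class, then the unweighted (macro) average.
per_class <- sapply(classes, function(k) ovr_auc(y == k, scores[, k]))
macro_auc <- mean(per_class)
print(per_class)
print(macro_auc)
```

With a randomForest model, `predict(fit, type = "prob")` should give a matrix of this shape. Note that this macro average is a different definition from the pairwise Hand and Till measure that pROC's multiclass.roc reports, so the two numbers need not agree.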
