
I am looking to plot FPR vs TPR points on an AUC graph for different thresholds.

For example, if data$C2 is the true response column (either 0 or 1) in my data frame, I want to make a vector of predicted values (0 or 1) depending on whether data$C1 (a different measurement column) is above or below a specified threshold. Here is the function I've attempted with the ROCR package.

fun <- function(data, col1, col2) {

  perfc <- NULL   # Create null vectors for prediction and performance
  perfs <- NULL
  temp <- NULL

  d <- seq(0.10, 0.30, 0.01)   ## Various thresholds to be tested

  for (i in length(d)) {

    temp <- ifelse(data[, col1] > d, 1, 0)    ## Create predicted responses
    pred <- prediction(temp, data[, col2])    # Predict responses over true values
    perf <- performance(pred, "tpr", "fpr")   # Store performance information

    predc[i] <- pred   # Do this i times for every d in the sequence
    perfc[i] <- perf

    preds <- prediction.class(predc, col2)           # Combine to make prediction class
    perfs <- performance.class(preds, "tpr", "fpr")  # Combine to make performance class
  }

  plot(perfs)   # Plot TPR against FPR
}

Is the problem that temp is a list vector while the true labels come from a matrix? Or am I applying the for loop incorrectly?

Thanks in advance!

Edit: Here's my attempt to do this manually, without the ROCR package.

fprs <- NULL   # Collect an (FPR, TPR) pair for every threshold
tprs <- NULL

for (t in seq(0.40, 0.60, 0.01)) {   # I want to do this for every t in the sequence
  TP <- 0
  FP <- 0
  p <- sum(data$C2 == 1, na.rm = TRUE)   # Total number of actual positives
  n <- sum(data$C2 == 0, na.rm = TRUE)   # Total number of actual negatives
  list <- data$C1                        # Column to vector
  test <- ifelse(list > t, 1, 0)         # Make prediction vector

  for (i in 1:nrow(data)) {
    if (isTRUE(test[i] == 1 & data$C2[i] == 1)) {
      TP <- TP + 1   # Count number of true positives
    }
    if (isTRUE(test[i] == 1 & data$C2[i] == 0)) {
      FP <- FP + 1   # Count number of false positives
    }
  }

  fprs <- c(fprs, FP / n)   # Store the rates for this threshold
  tprs <- c(tprs, TP / p)
}

plot(x = fprs, y = tprs)   # Plot every (FPR, TPR) pair
user2324

1 Answer


I hope I understand your question right, but I think that by AUC graph you mean an ROC curve. The ROC curve already takes different thresholds into account when making those classification decisions. See this Wikipedia page; I found this picture particularly helpful.

If the above is right, then all you need to do in your code is:

pred <- prediction(data[,col1], data[,col2])  
perf <- performance(pred, "tpr","fpr")  
plot(perf)  
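
If you also want the actual cutoff, TPR and FPR values behind the curve (say, only for thresholds between 0.1 and 0.3, as in your sequence), ROCR stores them as S4 slots on the performance object, accessed with @. A minimal sketch, reusing pred and perf from above (the 0.1 to 0.3 range is just your example):

cutoffs <- perf@alpha.values[[1]]                  # the thresholds ROCR evaluated
fpr     <- perf@x.values[[1]]                      # false positive rate at each cutoff
tpr     <- perf@y.values[[1]]                      # true positive rate at each cutoff
acc     <- performance(pred, "acc")@y.values[[1]]  # accuracy at each cutoff (same ordering)
keep    <- cutoffs >= 0.1 & cutoffs <= 0.3         # restrict to the thresholds of interest
data.frame(cutoff = cutoffs[keep], fpr = fpr[keep], tpr = tpr[keep], acc = acc[keep])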

If you would like to add a different curve to that plot, perhaps because you used a different classification technique (e.g. a decision tree instead of logistic regression), then use plot(perf2, add=TRUE), where perf2 is created in the same way as perf. See the documentation.
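
For example, a minimal sketch of overlaying a second curve, assuming scores2 holds the predicted scores of a second, hypothetical classifier evaluated against the same true labels:

pred2 <- prediction(scores2, data[,col2])   # scores2 is a placeholder for the second model's scores
perf2 <- performance(pred2, "tpr", "fpr")
plot(perf, col = "blue")                    # first ROC curve
plot(perf2, add = TRUE, col = "red")        # overlay the second curve on the same axes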

Vincent Lous
  • Is there any way to restrict or input the specific thresholds I want to include in the ROC curve? Say t = 0.1 to t = 0.3, and see the resulting tpr, fpr, accuracy, etc. for each t? – user2324 Nov 15 '15 at 22:03
  • How about the graph in this [gallery](http://rocr.bioinf.mpi-sb.mpg.de/rocr_gallery.html)? – Vincent Lous Nov 15 '15 at 22:26