I have built a machine learning model in Python. By default, a Random Forest classifier uses 0.5 as the threshold for assigning "YES" or "NO" (that is, if the predicted probability for a record is above 50%, it is labelled "YES", otherwise "NO").
I would like to know how to determine the optimum threshold for the trained model (that is, the cutoff value at which the "YES" class is predicted best) so that I can improve the model's performance.
In R, people use a loop over candidate cutoffs to determine the optimum threshold, so I would like to know how to do the same in Python.
Below is the R code for this:
library(caret)  # provides confusionMatrix()

# Sensitivity, specificity and accuracy at a given cutoff
perform_fn_rf <- function(cutoff) {
  # rf_pred[, 2] holds the predicted probability of "YES"
  predicted_response <- as.factor(ifelse(rf_pred[, 2] >= cutoff, "YES", "NO"))
  conf <- confusionMatrix(predicted_response, train_validation$Outcome.Status, positive = "YES")
  acc <- conf$overall[1]   # Accuracy
  sens <- conf$byClass[1]  # Sensitivity
  spec <- conf$byClass[2]  # Specificity
  OUT_rf <- t(as.matrix(c(sens, spec, acc)))
  colnames(OUT_rf) <- c("sensitivity", "specificity", "accuracy")
  return(OUT_rf)
}
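
For comparison, here is a minimal Python sketch of the same idea, not a definitive implementation. It assumes the "YES" probabilities have already been extracted, for example with scikit-learn's `predict_proba` (the names `model` and `X_valid` in the comment are hypothetical). The helper `sweep_cutoffs` and the choice of Youden's J (sensitivity + specificity - 1) as the selection criterion are my own assumptions; the R code above only reports the three metrics and does not say how the final cutoff is picked.

```python
import numpy as np


def perform_fn_rf(proba_yes, y_true, cutoff, positive="YES"):
    """Return (sensitivity, specificity, accuracy) at one cutoff."""
    proba_yes = np.asarray(proba_yes, dtype=float)
    actual = np.asarray(y_true) == positive
    predicted = proba_yes >= cutoff  # same rule as the R ifelse()

    tp = np.sum(predicted & actual)
    tn = np.sum(~predicted & ~actual)
    fp = np.sum(predicted & ~actual)
    fn = np.sum(~predicted & actual)

    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    acc = (tp + tn) / actual.size
    return float(sens), float(spec), float(acc)


def sweep_cutoffs(proba_yes, y_true, cutoffs=None):
    """Evaluate every cutoff; pick the one maximising Youden's J (an assumption)."""
    if cutoffs is None:
        cutoffs = np.linspace(0.01, 0.99, 99)
    rows = np.array([perform_fn_rf(proba_yes, y_true, c) for c in cutoffs])
    youden_j = rows[:, 0] + rows[:, 1] - 1.0
    best = float(cutoffs[np.argmax(youden_j)])  # first cutoff on ties
    return cutoffs, rows, best


# Toy data standing in for real predictions. With a fitted scikit-learn
# classifier one would typically use something like
#   proba_yes = model.predict_proba(X_valid)[:, 1]
# after checking model.classes_ to confirm column 1 is the "YES" class.
proba_yes = np.array([0.9, 0.8, 0.45, 0.4, 0.2, 0.1])
y_true = np.array(["YES", "YES", "YES", "NO", "NO", "NO"])

cutoffs, metrics, best_cutoff = sweep_cutoffs(proba_yes, y_true)
print("best cutoff:", best_cutoff)
print("sens, spec, acc at best:", perform_fn_rf(proba_yes, y_true, best_cutoff))
```

The cutoff should be chosen on validation data rather than the training set, and a different criterion (for example, the point where sensitivity and specificity cross, or maximum accuracy) can be substituted by changing the one line that computes `youden_j`.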