I am currently trying to figure if there is a way to get the 95% CI of the AUC in python. Currently, I have a ypred list that contains the highest probability class predictions between the 4 classes I have(so either a 0/1/2/3 at each position) and a yactual list which contains the actual labels at each position. How exactly do I go about bootstrapping samples for multiple classes?
Edit: Currently the way I am calculating the AUC is by doing a one-vs-all scheme, where I take the AUC for each classes versus the rest and averaging those 4 values to get the final AUC.