I am writing some code for regularized logistic regression, and I have observed an interesting phenomenon. I wonder whether it is normal or whether my code is wrong.
For the loss function, I am using the logistic loss, i.e., maximizing the likelihood of the binary responses. To make predictions, I compute predicted probabilities for new observations and use the ROC curve / AUC to pick the best classification threshold.
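Here is a minimal sketch of the pipeline I mean, using scikit-learn on toy data (the data and the Youden's J rule for picking the threshold are stand-ins for illustration, not my actual code):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

# Toy data standing in for my real problem
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_new, y_train, y_new = train_test_split(X, y, test_size=0.5, random_state=0)

# L2-regularized logistic regression (logistic loss = negative Bernoulli log-likelihood)
model = LogisticRegression(penalty="l2", C=1.0).fit(X_train, y_train)

# Predicted probabilities for the new observations
p_new = model.predict_proba(X_new)[:, 1]

# ROC curve on the new data; pick the threshold maximizing Youden's J = TPR - FPR
fpr, tpr, thresholds = roc_curve(y_new, p_new)
best_threshold = thresholds[np.argmax(tpr - fpr)]
print("AUC:", roc_auc_score(y_new, p_new), "threshold:", best_threshold)
```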
The funny thing is, I often run into cases where one parameter estimate has a far better deviance (the logistic-regression analogue of MSE) on new observations than another estimate, yet its predictions are much worse. So it seems to me that deviance may have little to do with classification performance, unlike linear regression, where MSE is the prediction criterion itself. Has anyone seen the same thing?
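To make the question concrete, here is a tiny hand-constructed example (made-up probabilities, not my real data) where the model with the better deviance gives the worse 0/1 predictions at threshold 0.5:

```python
import numpy as np
from sklearn.metrics import log_loss, accuracy_score

y = np.array([0, 0, 1, 1])

# Model A: barely separated probabilities, but always on the correct side of 0.5
p_a = np.array([0.45, 0.45, 0.55, 0.55])
# Model B: very confident on three cases, wrong side of 0.5 on the fourth
p_b = np.array([0.01, 0.01, 0.99, 0.49])

for name, p in [("A", p_a), ("B", p_b)]:
    dev = 2 * log_loss(y, p)          # mean deviance = 2 * log loss
    acc = accuracy_score(y, p > 0.5)  # 0/1 accuracy at threshold 0.5
    print(name, "deviance:", round(dev, 3), "accuracy:", acc)
# B has the lower (better) deviance but the worse accuracy,
# exactly the pattern described above.
```

The point of the sketch is that deviance rewards confident correct probabilities and only mildly penalizes a borderline miss, while 0/1 classification only cares about which side of the threshold each probability falls on, so the two criteria can rank models differently.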