I am trying to implement this loss function, weighted_loss, as a custom objective in XGBoost using the sklearn XGBClassifier wrapper, as follows:
import numpy as np
from xgboost import XGBClassifier

def weighted_binary_cross_entropy(dtrain, pred):
    # assign the value of the imbalance alpha
    imbalance_alpha = 90
    # retrieve the labels from the dtrain matrix
    print('!!!!')
    label = dtrain.get_labe()  # BUG intentionally added here (dropped the trailing "l")
    # compute the prediction with a sigmoid
    sigmoid_pred = 1.0 / (1.0 + np.exp(-pred))
    # gradient and hessian of the weighted loss w.r.t. the raw margin
    grad = -(imbalance_alpha ** label) * (label - sigmoid_pred)
    hess = (imbalance_alpha ** label) * sigmoid_pred * (1.0 - sigmoid_pred)
    return grad, hess
xgb_test = XGBClassifier(obj=weighted_binary_cross_entropy).fit(
    X_trainnorm.to_numpy(), y_train.to_frame().to_numpy()
)
y_pred = xgb_test.predict(X_testnorm)
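For context, as far as I can tell the grad and hess above are just the first and second derivatives, with respect to the raw margin, of -(alpha ** y) * (y * log(p) + (1 - y) * log(1 - p)) with p = sigmoid(margin); a quick finite-difference check on toy numbers (my own check, not from the source repo) seems to agree:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def weighted_bce(margin, y, alpha=90):
    # weighted binary cross-entropy as a function of the raw margin
    p = sigmoid(margin)
    return -(alpha ** y) * (y * np.log(p) + (1 - y) * np.log(1 - p))

margin = np.array([0.3, -1.2, 2.0])
y = np.array([1.0, 0.0, 1.0])
eps = 1e-6

# central finite difference of the loss w.r.t. the raw margin
num_grad = (weighted_bce(margin + eps, y) - weighted_bce(margin - eps, y)) / (2 * eps)
ana_grad = -(90 ** y) * (y - sigmoid(margin))   # same formula as grad above
print(np.allclose(num_grad, ana_grad, atol=1e-4))   # True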
However, it does not seem to be using weighted_binary_cross_entropy at all: I put in the print statement and purposely added a bug by dropping the trailing "l" in get_label, and neither the print output nor an error ever appears.
I've seen several posts about confusion regarding obj versus objective. I noticed that for binary classification, passing obj supposedly just falls back to the default objective, so I tried this:
xgb_test = XGBClassifier(objective=weighted_binary_cross_entropy).fit(
    X_trainnorm.to_numpy(), y_train.to_frame().to_numpy()
)
but when I run it, I get:
'numpy.ndarray' object has no attribute 'get_label'
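Based on the error, my guess is that the sklearn wrapper unpacks the DMatrix itself and hands the callable the labels and the raw predictions as plain numpy arrays, so perhaps the function needs to look more like this (just a sketch of my guess; the (y_true, y_pred) argument order and names are my assumption, not something I've confirmed in the docs):

import numpy as np
from xgboost import XGBClassifier

def weighted_binary_cross_entropy(y_true, y_pred):
    # assumption: labels and raw margin predictions arrive as numpy arrays,
    # not wrapped in a DMatrix
    imbalance_alpha = 90
    sigmoid_pred = 1.0 / (1.0 + np.exp(-y_pred))
    grad = -(imbalance_alpha ** y_true) * (y_true - sigmoid_pred)
    hess = (imbalance_alpha ** y_true) * sigmoid_pred * (1.0 - sigmoid_pred)
    return grad, hess

xgb_test = XGBClassifier(objective=weighted_binary_cross_entropy).fit(
    X_trainnorm.to_numpy(), y_train.to_frame().to_numpy()
)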
I dug into the module I grabbed the custom loss function from and see that dtrain is an xgb.DMatrix object (see dtrain defined here). Do I need to convert the numpy arrays to a DMatrix, or is there a way to just use numpy arrays directly, especially since that code snippet uses the native train method as opposed to the sklearn fit method?
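For reference, my understanding of the native-API route that the source snippet takes is roughly the following (the DMatrix construction, the params, and the post-processing are my own guesses, not the repo's code):

import numpy as np
import xgboost as xgb

# wrap the numpy arrays in DMatrix objects so dtrain.get_label() works
dtrain = xgb.DMatrix(X_trainnorm.to_numpy(), label=y_train.to_numpy())
dtest = xgb.DMatrix(X_testnorm.to_numpy())

# note: the xgboost demo examples define the callable as f(preds, dtrain),
# i.e. predictions first, so the argument order in my function may need flipping
booster = xgb.train({'max_depth': 6}, dtrain, num_boost_round=100,
                    obj=weighted_binary_cross_entropy)

raw_pred = booster.predict(dtest)   # with a custom objective these are raw margins, not probabilities
y_pred = (1.0 / (1.0 + np.exp(-raw_pred)) > 0.5).astype(int)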
Any insight into this would be most welcome.