
So I am relatively new to the ML/AI game in Python, and I'm currently working on implementing a custom objective function for XGBoost.

My differential equation knowledge is pretty rusty, so to check that I'm doing this correctly I've created a custom objective function with a gradient and hessian that models the mean squared error function that is run as the default objective in XGBRegressor. The problem is, the results of the two models don't match: the error outputs are close but not identical for the most part, and way off for some points. I don't know what I'm doing wrong, or how that could be possible if I am computing things correctly. If you all could look at this and maybe provide insight into where I am wrong, that would be awesome!

The original code without a custom function is:

    import xgboost as xgb

    reg = xgb.XGBRegressor(n_estimators=150,
                           max_depth=2,
                           objective="reg:squarederror",
                           n_jobs=-1)

    reg.fit(X_train, y_train)

    y_pred_test = reg.predict(X_test)

and my custom objective function for MSE is as follows:

    def gradient_se(y_true, y_pred):
        #Compute the gradient of the squared error.
        return (-2 * y_true) + (2 * y_pred)

    def hessian_se(y_true, y_pred):
        #Compute the hessian for the squared error.
        return 0*(y_true + y_pred) + 2

    def custom_se(y_true, y_pred):
        #Squared error objective. A simplified version of MSE used as
        #the objective function.

        grad = gradient_se(y_true, y_pred)
        hess = hessian_se(y_true, y_pred)
        return grad, hess

The documentation reference is here.

Thanks!

jpb

1 Answer


According to the documentation, the library passes the predicted values (y_pred in your case) first and the ground truth values (y_true in your case) second.

Because your custom_se(y_true, y_pred) declares the arguments in the opposite order, the value you call y_true actually holds the predictions and y_pred holds the labels, and those swapped values get forwarded to gradient_se and hessian_se. For the hessian it doesn't make a difference, since the hessian should return 2 for all x values and you've done that correctly.

For the gradient it does matter: (-2 * y_true) + (2 * y_pred) ends up evaluating to 2 * (labels - predictions), which is the negative of the correct gradient 2 * (predictions - labels).

The correct implementation is as follows:

    def gradient_se(y_pred, y_true):
        #Compute the gradient of the squared error.
        return 2*(y_pred - y_true)

    def hessian_se(y_pred, y_true):
        #Compute the hessian for the squared error.
        return 0*y_true + 2

    def custom_se(y_pred, y_true):
        #Squared error objective. A simplified version of MSE used as
        #the objective function.

        grad = gradient_se(y_pred, y_true)
        hess = hessian_se(y_pred, y_true)
        return grad, hess
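
For reference, here's a rough sketch of how this would be wired up through the native API with xgb.train, where the second argument your objective receives is the training DMatrix, so the labels have to be pulled out with get_label() first (custom_se_native, dtrain and booster are just names I picked here, and I'm assuming the X_train/y_train/X_test from your snippet):

    import xgboost as xgb

    def custom_se_native(preds, dtrain):
        # native API: predictions first, training DMatrix second
        labels = dtrain.get_label()
        return custom_se(preds, labels)

    dtrain = xgb.DMatrix(X_train, label=y_train)
    booster = xgb.train({"max_depth": 2}, dtrain,
                        num_boost_round=150, obj=custom_se_native)
    y_pred_test = booster.predict(xgb.DMatrix(X_test))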

Update: Please keep in mind that the native XGBoost implementation and the sklearn wrapper for XGBoost use a different ordering of the arguments. The native implementation takes the predictions first and the training DMatrix (which holds the true labels) second, while the sklearn wrapper takes the true labels first and the predictions second, both as numpy arrays.
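
For the wrapper, a rough sketch of the same objective with the flipped signature (custom_se_sklearn is just an illustrative name) would look something like:

    import numpy as np
    import xgboost as xgb

    def custom_se_sklearn(y_true, y_pred):
        # sklearn wrapper: true labels first, predictions second
        grad = 2 * (y_pred - y_true)
        hess = np.full_like(y_pred, 2.0)
        return grad, hess

    reg = xgb.XGBRegressor(n_estimators=150, max_depth=2,
                           objective=custom_se_sklearn, n_jobs=-1)
    reg.fit(X_train, y_train)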

Qqbt
  • According to the documentation, the order is `y_true`, `y_pred`. See the Note. https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.sklearn But I think the docs are wrong, because all the objective functions implemented here flip the order https://github.com/dmlc/xgboost/blob/master/demo/guide-python/custom_objective.py – Pavel Komarov Jul 12 '21 at 16:32
  • Actually, from my experimentation, I'm seeing that `y_true`, `y_pred` is the correct ordering now. – Pavel Komarov Jul 12 '21 at 17:59
  • @PavelKomarov both the docs and the sample code show that preds is first and dtrain is second (dtrain has the labels), see https://github.com/dmlc/xgboost/blob/778135f6575ec8735cf899748d003446d1e3a51d/demo/guide-python/custom_objective.py#L23 – Qqbt Jul 23 '21 at 08:32
  • 1
    I don't know what to tell you, bub. I printed the info coming in to the function, and it's clear the first argument holds the true targets, while the second is the model predictions (way off targets at first, but get closer). I'm using the sklearn wrapper, XGBRegressor, and I'm seeing both args come in as numpy arrays, rather than preds as an xgboost data matrix. – Pavel Komarov Jul 24 '21 at 10:53
  • Oh, you didn't mention that you're using the sklearn wrapper! The sklearn wrapper flips the arguments https://github.com/dmlc/xgboost/blob/778135f6575ec8735cf899748d003446d1e3a51d/python-package/xgboost/sklearn.py#L61. Keep in mind that the original question is NOT using the sklearn wrapper and with the original code it was passing the arguments incorrectly, since it was not using the sklearn wrapper interface, so your particular comment here is just plain wrong, bub. – Qqbt Jul 25 '21 at 17:09
  • lol. Sorry for the confusion. The first link in my first comment takes you to the sklearn API. It's kind of crazy that the order is flipped. XGBoost should be more consistent. – Pavel Komarov Jul 26 '21 at 17:37
  • no worries! I'll update the answer with the info – Qqbt Jul 28 '21 at 09:14
  • Hi - +1 to both OP Q and this answer, they have been very helpful. I had a follow-up question: I'm using the sklearn wrapper (only in this case `XGBClassifier`), my function is receiving `y_true` and `y_pred` in the correct order, but for some reason I'm getting a 2-dimensional `ndarray` for `y_pred` where each row is `[0.5, 0.5, 0.5, 0.5]` - at least in the first iteration. Any idea what this could mean and how I can get the actual predictions? Thanks! – sparc_spread Aug 04 '21 at 16:33
  • @sparc_spread Without seeing the code, I'm guessing that you're doing multi-class classification with 4 classes. I'm not exactly sure why you're getting 0.5 for all 4 classes, because they should sum to 1, but if you make a new question with your code I can take a look. – Qqbt Oct 02 '21 at 09:24