I have a binary classification XGBTree model. The data frame used to train the model contains many independent variables (x), and I want to optimize one of those x's to improve the chance that the predicted result is 1.

I wonder how this can be achieved. I searched for the default optim function, but it seems to expect an explicit equation to optimize, and an XGBTree model does not give me an equation to enter. The same goes for Gurobi: the many examples I have seen all require an equation.

Is there any way I can optimize against an XGBTree model? If so, how can I implement such a method? The code I used to train the XGBTree model is as follows.

Thank you.

library(caret)

# Tuning grid: all hyperparameters fixed except eta, which is tuned
xgb_grid <- expand.grid(
    nrounds = 500,
    max_depth = 5,
    eta = c(0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.11, 0.12),
    gamma = 0.3,
    colsample_bytree = 0.25,
    min_child_weight = 2,
    subsample = 0.5
)
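
ctrl is a trainControl object defined separately. Since the metric below is ROC, it must enable class probabilities and use a summary function such as twoClassSummary; an assumed, illustrative definition along these lines:

ctrl <- trainControl(method = "cv",                     # assumed resampling scheme
                     number = 5,
                     classProbs = TRUE,                 # required for metric = "ROC"
                     summaryFunction = twoClassSummary) # reports ROC, Sens, Spec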

xgb <- train(y ~ .,                 # model specification
             data = train,        # train set used to build model
             method = "xgbTree",    # type of model you want to build
             trControl = ctrl,      # how you want to learn
             tuneGrid = xgb_grid,   # tune grid
             metric = "ROC",        # performance measure
             verbose = TRUE
)

Some real examples of how this can be achieved would be appreciated.

  • It's a bit unclear what you're asking here. What do you mean by "optimize" here, and how is it different from the tuning of (hyper)parameters that happens in training and the model fitting that happens once the tuning parameters have been chosen? – Marius Feb 14 '19 at 03:24
  • For example, one observation's predicted probability from the XGBTree model is 0.4. I want to optimize the x variable I'm interested in within the data frame so that the probability improves from 0.4 to, let's say, 0.6. – Nelson Chou Feb 14 '19 at 03:47
  • I think there is a little confusion with the meaning of fitting a predictive model. Namely, given a response variable `y` and a set of predictor variables (`x`), the model aims at finding a set of rules or combination of the `x`'s that predicts the value of `y` with minimum "error". These "rules" are obtained by **optimizing a pre-specified objective measure** that is computed on the whole training sample (for instance by _minimizing_ an error measure). In the above setting the values of the `x`'s are fixed and therefore they _cannot be optimized_. What you optimize is the objective measure. – mastropi Feb 14 '19 at 22:04
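
What the second comment describes can be treated as black-box optimization: hold every predictor of a reference observation fixed except the one x of interest, treat the model's predicted probability as a function of that variable, and search over it directly. Below is a minimal sketch under stated assumptions: the variable of interest is called x1 and the positive class level is "yes" (both placeholder names), and type = "prob" predictions require classProbs = TRUE in the trainControl.

library(caret)

# Predicted probability of the positive class as a function of a single
# variable, all other predictors held fixed at one reference row.
# "x1" and the class level "yes" are placeholder names; substitute the
# actual column and factor-level names from your data.
prob_positive <- function(value, model, reference_row) {
    reference_row$x1 <- value
    predict(model, newdata = reference_row, type = "prob")[, "yes"]
}

# A tree ensemble is piecewise constant in each feature, so a plain grid
# search over candidate values is more robust here than a gradient-based
# optimizer such as optim().
reference_row <- train[1, ]   # the observation whose prediction we want to raise
candidates <- seq(min(train$x1), max(train$x1), length.out = 200)
probs <- sapply(candidates, prob_positive,
                model = xgb, reference_row = reference_row)

candidates[which.max(probs)]  # value of x1 with the highest predicted probability
max(probs)                    # the probability achieved there

For a smooth model one could hand prob_positive to optimize() over an interval instead, but an XGBoost prediction only changes at the trees' split points, so a sufficiently fine grid over the observed range of x1 is a simple and reliable way to locate the best value.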
