1

I trained an XGBClassifier for my classification problem and did hyperparameter tuning over a huge grid (probably tuned every possible parameter) using Optuna. While testing, changing the random_state changes the model performance metrics (roc_auc/recall/precision), the feature_importance, and even the model predictions (predict_proba).

  1. What does this tell me about my data?

Since I have to take this model into production, how should I tackle this to make the model more robust?

  1. Stick with one random_state (say the default, 0), use it during cross-validation, and use it on the out-of-sample data as well.
  2. During cross-validation, for each parameter combination, run a few random_state values (say 10) and take the average model performance (a rough sketch of this is below).
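
A minimal sketch of option 2 inside an Optuna objective, assuming training data X, y and illustrative (not actual) parameter ranges:

```python
import numpy as np
import optuna
from sklearn.model_selection import StratifiedKFold, cross_val_score
from xgboost import XGBClassifier

def objective(trial):
    # Illustrative search space, not the real grid.
    params = {
        "max_depth": trial.suggest_int("max_depth", 2, 8),
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
    }
    scores = []
    for seed in range(10):  # average each parameter combination over several seeds
        model = XGBClassifier(**params, n_estimators=300, random_state=seed)
        cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
        scores.append(cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean())
    return np.mean(scores)  # Optuna optimizes the seed-averaged CV score

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100)
```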
dgomzi
  • Do you subsample while training? If that's the case, you will of course randomly pick different variables depending on the seed you choose. This is also hinted at by the different importance results. – CAPSLOCK Dec 13 '19 at 08:47
  • Yes, subsample is one of the parameters being optimized. How can we tell our model is robust despite all this randomness? – dgomzi Dec 13 '19 at 09:05

3 Answers

1

These are my two cents. Take the answer with a grain of salt.

The XGB classifier is a boosting algorithm, which naturally depends on randomness (as does a Random Forest, for example). Hence, changing the seed will inherently change the training of the model and its output.
Different seeds will also change the CV splits and further alter the results.

Further, boosting aims to reduce variance by combining multiple models (as bagging does), and at the same time it reduces bias by training each subsequent model on the previous models' errors (the boosting part). However, boosting models can, in principle, overfit.
In fact, if your base learner is not weak, it will easily overfit the data and there won't be any residuals or errors left for the subsequent models to build upon.

Now, for your problem, you should first verify that you are not overfitting your model to the data.

Then you might want to fix a certain number of seeds (you still want to be able to reproduce the results, so it's important to fix them) and average the results obtained across those seeds.
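
For example, a minimal sketch of this (with best_params standing in for your tuned parameters and X, y for your training data):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from xgboost import XGBClassifier

seeds = [0, 1, 2, 3, 4]  # fixed list of seeds, so the experiment stays reproducible
scores = []
for seed in seeds:
    model = XGBClassifier(**best_params, random_state=seed)
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    scores.append(cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean())

print(f"AUC over seeds: {np.mean(scores):.4f} +/- {np.std(scores):.4f}")
```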

CAPSLOCK
  • Hi, I have fixed random_state in XGBClassifier but each time I run it my accuracy is different. I also fix the random state for CV in StratifiedKFold as well as in the train/test split. Is there anything I am not fixing? – Maths12 Feb 04 '22 at 12:15
1

I tend to think that if the model is sensitive to the random seed, it isn't a very good model. With XGB you can try adding more estimators; that can help make it more stable.

For any model with a random seed, for each candidate set of parameter options (usually already filtered down to a shortlist of candidates), I tend to run a bunch of repeats on the same data with different random seeds and measure the difference in the output. I expect the standard deviation of the evaluation metric to be small (relative to its mean), and the overlap of the predictions in each class to be very high. If either of these is not the case, I don't accept the model. If it is the case, I simply pick one of the candidate models at random; it should not matter what the random seed is!
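
A rough sketch of that check (candidate_params and the train/validation split here are placeholders, not part of the original answer):

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

aucs, preds = [], []
for seed in range(10):  # same data, different seeds
    model = XGBClassifier(**candidate_params, random_state=seed)
    model.fit(X_train, y_train)
    aucs.append(roc_auc_score(y_val, model.predict_proba(X_val)[:, 1]))
    preds.append(model.predict(X_val))

# The metric spread should be small relative to its mean ...
print(f"AUC: mean={np.mean(aucs):.4f}, std={np.std(aucs):.4f}")

# ... and the predicted classes should barely change between seeds.
agreement = np.mean([np.mean(p == preds[0]) for p in preds[1:]])
print(f"Agreement with the first seed's predictions: {agreement:.3f}")
```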

I still record the random seed used, as it is needed to recreate the results!

Ken Syme
0

The random_state parameter just helps in replicating results every time you run your model. Since you are using cross-validation, assuming it is k-fold, all of your data will end up in both the training and test folds, and the CV score will in any case be the average over the number of folds you choose. I believe you can set any random_state and quote the results from the CV.
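
For example (a minimal sketch, with X and y standing in for your data):

```python
from sklearn.model_selection import StratifiedKFold, cross_val_score
from xgboost import XGBClassifier

model = XGBClassifier(random_state=0)  # any fixed seed, for reproducibility
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print(fold_scores.mean())  # the CV score you would quote
```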