I am using XGBoost with Python to perform a binary classification in which class 0 appears roughly 9 times more frequently than class 1, so I am using scale_pos_weight=9. However, when I split the data with train_test_split, train the model, and predict on the test data, I obtain a y_pred with twice as many elements belonging to class 1 as there should be (20% instead of 10%). How can I correct this output? I thought scale_pos_weight=9 would be enough to inform the model of the expected proportion.

donut

1 Answer

Your question is unclear: what is y_pred?

Also, remember that you would be better off running a grid search or a Bayesian optimizer to find the settings that give the best scores.
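The answer does not say what to search over, so here is only one possible reading, as a minimal sketch: a grid search over the decision threshold used to turn predicted probabilities into y_pred. Everything in it is an assumption for illustration: the labels and probabilities are synthetic stand-ins (in practice the probabilities would come from something like the trained model's predict_proba on a validation split), F1 is an arbitrary choice of score, and the names f1_score and grid_search_threshold are hypothetical helpers, not part of XGBoost.

```python
import random


def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1), computed from raw counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def grid_search_threshold(y_true, proba, grid):
    """Score every threshold in `grid`; return the best entry and all entries."""
    results = []
    for threshold in grid:
        y_pred = [1 if p >= threshold else 0 for p in proba]
        results.append({
            "threshold": threshold,
            "f1": f1_score(y_true, y_pred),
            # Share of samples predicted as class 1 at this threshold.
            "positive_rate": sum(y_pred) / len(y_pred),
        })
    best = max(results, key=lambda r: r["f1"])
    return best, results


# Synthetic validation data (assumption): about 10% positives, with
# noisy scores that are higher on average for the positive class.
rng = random.Random(0)
y_val = [1 if rng.random() < 0.1 else 0 for _ in range(2000)]
proba_val = [
    min(1.0, max(0.0, rng.gauss(0.7 if y == 1 else 0.3, 0.15)))
    for y in y_val
]

grid = [i / 20 for i in range(1, 20)]  # thresholds 0.05, 0.10, ..., 0.95
best, results = grid_search_threshold(y_val, proba_val, grid)
print(best)
```

The positive_rate field makes it easy to see how the share of predicted class 1 moves with the threshold, which is the quantity the question is about. The same loop could be pointed at a model hyperparameter such as scale_pos_weight instead, at the cost of retraining the model for each grid value.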
