I am using xgboost with Python to perform a binary classification in which class 0 appears roughly 9 times more frequently than class 1. I am of course using `scale_pos_weight=9`. However, when I predict on the test data after training the model (the data was split with `train_test_split`), I obtain a `y_pred` with twice as many elements in class 1 as expected (20% instead of 10%). How can I correct this output? I thought `scale_pos_weight=9` would be enough to inform the model of the expected proportion.

donut
1 Answer
Your question is a bit unclear: what is `y_pred`?
Also, remember that you are better off running a grid search or a Bayesian optimizer to find the hyperparameters that give the best scores.
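
One thing such a search could cover, besides the model hyperparameters, is the decision threshold. `scale_pos_weight` reweights the positive examples in the loss; it does not constrain the share of predicted positives, so the scores for class 1 tend to be pushed upward and the default 0.5 cutoff can flag more positives than the base rate. Below is a minimal sketch, assuming `y_pred` comes from thresholding class-1 probabilities (for example from `predict_proba`). The helper names and the scores are hypothetical and only illustrate the idea; in practice the cutoff would be chosen on a validation set, and matching the base rate is just one possible criterion.

```python
def threshold_for_positive_rate(probs, target_rate):
    """Return a cutoff so that roughly `target_rate` of `probs` fall at or above it.

    Hypothetical helper, not part of xgboost.
    """
    if not probs:
        raise ValueError("probs must not be empty")
    if not 0.0 < target_rate <= 1.0:
        raise ValueError("target_rate must be in (0, 1]")
    ranked = sorted(probs, reverse=True)
    # Number of samples we want labelled as class 1 (at least one).
    k = max(1, round(target_rate * len(ranked)))
    # Ties at the cutoff would let slightly more than k samples through.
    return ranked[k - 1]


def apply_threshold(probs, threshold):
    """Label a sample 1 when its class-1 probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


# Made-up scores standing in for the model's class-1 probabilities
# on held-out data (assumption: 20 samples, expected positive rate 10%).
probs = [0.95, 0.80, 0.62, 0.55, 0.41, 0.33, 0.30, 0.22, 0.18, 0.12,
         0.10, 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01]

default_pred = apply_threshold(probs, 0.5)
threshold = threshold_for_positive_rate(probs, target_rate=0.10)
adjusted_pred = apply_threshold(probs, threshold)

print("positives at 0.5 cutoff:", sum(default_pred))
print("chosen threshold:", threshold)
print("positives at chosen cutoff:", sum(adjusted_pred))
```

With these made-up scores the 0.5 cutoff labels 20% of the samples as class 1, while the rate-matched cutoff labels 10%. Whether that trade-off is desirable depends on the metric being optimised (precision, recall, F1, and so on), so it is worth scoring several cutoffs rather than fixing one in advance.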

Reza Paradise
- Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Dec 25 '22 at 12:59
- This does not provide an answer to the question. Once you have sufficient [reputation](https://stackoverflow.com/help/whats-reputation) you will be able to [comment on any post](https://stackoverflow.com/help/privileges/comment); instead, [provide answers that don't require clarification from the asker](https://meta.stackexchange.com/questions/214173/why-do-i-need-50-reputation-to-comment-what-can-i-do-instead). - [From Review](/review/late-answers/33483309) – Adrian David Smith Dec 26 '22 at 08:47