
I have about 5000 samples with their features, and I want to rank them by a score. I have already built a regression model that predicts the score directly, but I still want to try learning-to-rank methods, so I turned to the LightGBM Ranker.

Since LightGBM Ranker only accepts label values below 31, I have to group the scores into several categories, 1 to 4 for example. After training, the Ranker is able to rank the samples and achieves a good NDCG@20 score, but it is unable to rank items within the same group.

My problem is effectively one query vs. ~5000 documents, which seems a bit different from ordinary IR problems. It would be perfect if the Ranker accepted the full order or the actual scores of my samples as labels, but I don't know how to achieve this. Some posts suggest using the label_gain parameter, but I can't find any documentation on how to set it properly.
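
A minimal sketch of how the label_gain suggestion might be applied, assuming the scores are first converted into many integer relevance levels by rank and that label_gain is given one entry per level (linear gains here, rather than the default 2^i - 1). The helper names (`scores_to_levels`, `linear_label_gain`, `fit_single_query_ranker`) and the choice of 100 levels are hypothetical, and whether this actually improves the within-group ordering is untested:

```python
from typing import List, Sequence


def scores_to_levels(scores: Sequence[float], n_levels: int) -> List[int]:
    """Map continuous scores to integer relevance levels 0..n_levels-1 by rank.

    Higher score -> higher level; levels are (near-)equal-sized rank bins.
    Tied scores may land in different levels, since ties are broken by position.
    """
    n = len(scores)
    # Indices of the samples, sorted from lowest to highest score.
    order = sorted(range(n), key=lambda i: scores[i])
    levels = [0] * n
    for rank, idx in enumerate(order):
        levels[idx] = rank * n_levels // n
    return levels


def linear_label_gain(n_levels: int) -> List[int]:
    """Gain i for label i (an assumption; the default gain is 2**i - 1)."""
    return list(range(n_levels))


def fit_single_query_ranker(X, scores, n_levels: int = 100):
    """Hypothetical helper: fit LGBMRanker with all rows in a single query."""
    # Imported lazily so the helpers above work without LightGBM installed.
    import numpy as np
    import lightgbm as lgb

    y = np.asarray(scores_to_levels(scores, n_levels))
    ranker = lgb.LGBMRanker(
        objective="lambdarank",
        # One gain entry per label value, so labels up to n_levels - 1 are valid.
        label_gain=linear_label_gain(n_levels),
    )
    # A single group that contains every sample (one query, ~5000 documents).
    ranker.fit(X, y, group=[len(y)])
    return ranker
```

With four samples scored `[0.1, 0.9, 0.5, 0.3]` and four levels, `scores_to_levels` assigns each sample its own level, so the full order is preserved in the labels; with fewer levels than samples, neighbouring ranks share a level.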

I am new to ranking models, so any help would be appreciated. Thanks!

Tom Leung
