How to improve the performance of LightGBM Ranker?

Asked Mar 09 '20 at 11:12

Active Mar 09 '20 at 11:12

Viewed 1,507 times

I have about 5000 samples with their features, and I want to rank them by a score. I have already built a regression model that predicts the score directly, but I still want to try learning-to-rank methods, so I turned to the LightGBM Ranker.

Since the LightGBM Ranker only accepts label values below 31, I have to bin the scores into a few categories, 1 to 4 for example. After training, the Ranker ranks the samples and achieves a good NDCG@20 score, but it cannot order items that fall into the same category.
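For reference, the binning step described above can be sketched as follows. This is a minimal illustration rather than my exact code: the helper name `scores_to_labels` is hypothetical, and it assumes equal-sized rank bins, which is only one of several possible binning schemes.

```python
def scores_to_labels(scores, n_levels=31):
    """Map continuous scores to integer relevance labels 0..n_levels-1.

    Items are sorted by score and split into n_levels equal-sized rank bins,
    so a higher score never gets a lower label. Tied scores may land in
    different bins; this sketch does not treat ties specially.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # indices, ascending score
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_levels // n  # rank bin of this item
    return labels


# Hypothetical scores, for illustration only.
example_scores = [0.1, 0.9, 0.5, 0.3]
example_labels = scores_to_labels(example_scores, n_levels=4)
```

With more levels, fewer items share a label, so there are fewer ties that the Ranker has no signal to break.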

My problem is essentially one query vs. ~5000 documents, which seems a bit different from ordinary IR problems. It would be perfect if the Ranker accepted the full ordering or the actual scores of my samples as labels, but I don't know how to achieve this. Some posts suggest using the `label_gain` parameter, but I can't find any documentation on how to set it properly.

I am new to ranking models, so any help is appreciated. Thanks!

Tom Leung

`label_gain` is just an array, e.g. `label_gain = 2**np.arange(0, your_max_number)`; the default has 31 entries. – Siddhant Tandon Oct 26 '20 at 18:16
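To make the comment concrete, here is a minimal sketch of how `label_gain` might be set for a single-query problem. It relies on two documented facts: entry `i` of `label_gain` is the gain for integer label `i`, and the default is `0, 1, 3, 7, ..., 2^30 - 1`, which is where the limit of 31 label values comes from. The helper names `make_label_gain` and `fit_single_query_ranker` are hypothetical, and the linear gain is an assumption chosen so that high labels do not dominate; it is not a recommendation from the LightGBM documentation.

```python
def make_label_gain(n_levels, exponential=False):
    """Return one gain per integer label 0..n_levels-1 (list index == label)."""
    if exponential:
        # Same shape as LightGBM's default: 0, 1, 3, 7, ...
        return [2 ** i - 1 for i in range(n_levels)]
    # Linear gain (an assumption): stays small even with many levels.
    return list(range(n_levels))


def fit_single_query_ranker(X, labels, n_levels):
    """Fit an LGBMRanker on one query that contains every sample.

    labels must be integers in 0..n_levels-1.
    """
    # Imported here so make_label_gain works without LightGBM installed.
    import lightgbm as lgb

    ranker = lgb.LGBMRanker(
        objective="lambdarank",
        label_gain=make_label_gain(n_levels),
    )
    # group lists the size of each query; here there is a single query.
    ranker.fit(X, labels, group=[len(labels)])
    return ranker
```

Calling `fit_single_query_ranker(X, labels, n_levels=100)` would then train with 100 relevance levels instead of 4. Whether finer levels actually improve the within-category ordering is something to check on held-out data.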
