
Rank-based recommender systems use NDCG to evaluate recommendation accuracy. However, accuracy rate and recall rate are sometimes used to evaluate top-n recommendation. Does this mean that when NDCG is high, the accuracy rate is also high? I ran a ListRankMF algorithm, and its accuracy rate on the MovieLens 100k dataset is very low, only about 8%. What is the relation between NDCG and accuracy rate?

Tomasz Jakub Rup
Try Leung

1 Answer


NDCG is most helpful when the objective of the recommender system is to return some relevant results, and order is important. For example, recommending a translation, or recommending a bank account. It's not harmful if we miss relevant results, but for a good user experience we want them in a meaningful order.

Recall is most helpful when the objective of the recommender system is to return all relevant results, and order is unimportant. For example, a potential medical diagnosis or prescription. It is harmful if we miss a relevant result, since that might be the correct diagnosis or cure. The order is not important since we expect the medic to read through all the possibilities and use their expert knowledge for the final decision.

Suppose there are 5 drugs that we could recommend a doctor give to a patient (A to E), and 5 that we should not recommend (F to J). Our recommender system outputs the recommendations A, B, C, D. This gives us the following evaluations:

  • NDCG = 1.0
  • Recall = 0.8

In this case recall clearly shows we did not do as well as we could (since we did not recommend drug E), whereas NDCG leads us to believe we made the perfect recommendations.
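The two numbers for the drug example can be reproduced with a minimal Python sketch. It rests on some assumptions that are not stated in the answer: relevance is binary (1 for drugs A to E, 0 for F to J), the discount is the common 1 / log2(rank + 1), and NDCG is cut off at the length of the returned list (k = 4), so the ideal list is also four relevant items. The function names `dcg`, `ndcg_at_k` and `recall` are illustrative, not taken from any library.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg_at_k(recommended, relevant, k):
    """NDCG@k with binary relevance (assumption: 1 if relevant, else 0)."""
    gains = [1.0 if item in relevant else 0.0 for item in recommended[:k]]
    # Ideal list: as many relevant items as fit in the top k, all at the front.
    ideal_gains = [1.0] * min(k, len(relevant))
    ideal = dcg(ideal_gains)
    return dcg(gains) / ideal if ideal > 0 else 0.0


def recall(recommended, relevant):
    """Fraction of all relevant items that appear in the recommendations."""
    if not relevant:
        return 0.0
    return len(set(recommended) & set(relevant)) / len(relevant)


relevant = set("ABCDE")      # drugs we could recommend
recommended = list("ABCD")   # what the recommender returned

ndcg = ndcg_at_k(recommended, relevant, k=len(recommended))
rec = recall(recommended, relevant)
print(f"NDCG@4 = {ndcg:.2f}, recall = {rec:.2f}")

# Same items, but with an irrelevant drug ranked first: NDCG drops,
# while recall only depends on which relevant items were returned.
shuffled = list("FABC")
ndcg_shuffled = ndcg_at_k(shuffled, relevant, k=len(shuffled))
rec_shuffled = recall(shuffled, relevant)
```

Under these assumptions NDCG@4 is 1.0 because every returned item is relevant and no better ordering of four items exists, while recall is 4/5 because drug E is missing. If the ideal list were instead built from all five relevant drugs without a cutoff, NDCG would come out below 1.0, so the cutoff convention matters when comparing reported figures.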

If we were instead recommending books, then NDCG would be more appropriate. Recall is not so informative since there may be hundreds of relevant books, but we cannot expect a user to read through a list of hundreds of books to pick just one to read. NDCG would tell us if we are at least recommending some meaningful subset of what is possible.

Ben Horsburgh