So I just built a star-rating system and am trying to come up with an algorithm to list the "Top Rated" items. For simplicity, here are the columns:

item_name
average_rating (a decimal from 1 to 5)
num_votes

I'm trying to determine the "sweet spot" between number of votes and rating. For example...

  • An item rated (4.6 / 20 votes) should be higher on the list than an item that's (5.0 / 2 votes)
  • An item rated (2.5 / 100 votes) should be below an item that's (4.5 / 2 votes)

So in other words, num_votes plays a factor in what's "Top".

Anyone know of an algorithm that is pretty good at determining this "sweet spot"?

Thanks in advance.

Matt

3 Answers

Here's another, statistically sound way: http://www.thebroth.com/blog/118/bayesian-rating

longneck
  • To complement this, there's this option as well, which is a bit more intense: http://www.evanmiller.org/how-not-to-sort-by-average-rating.html Bayesian rating is probably much better, though. Still, it's an interesting alternative approach. – brianreavis Sep 16 '09 at 15:18
  • This solution is good, but it has the disadvantage that you need to know the average number of votes and ratings! That means more[!] MySQL queries for each rating calculation. – tuergeist Sep 16 '09 at 15:27
  • That evanmiller.org page is the one I was actually looking for, as that is also an excellent algorithm. I couldn't look it up at work because for some reason it's blocked by the content filter. – longneck Sep 16 '09 at 15:33
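
The Bayesian rating idea from the answer above can be sketched roughly as follows. This is a minimal sketch using a commonly cited form of the Bayesian average; the linked article may differ in details. The `Item` class and the function names are hypothetical, and the fields simply mirror the columns from the question. The two site-wide means are computed once per ranking pass, which is one way to limit the extra queries mentioned in the comments.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical container mirroring the question's columns.
    item_name: str
    average_rating: float  # decimal from 1 to 5
    num_votes: int


def bayesian_rating(item, mean_votes, mean_rating):
    """Pull an item's average toward the site-wide mean rating.

    The pull is strong for items with few votes and fades as votes grow.
    """
    return (mean_votes * mean_rating + item.num_votes * item.average_rating) / (
        mean_votes + item.num_votes
    )


def rank_top_rated(items):
    """Return items sorted from best to worst Bayesian rating."""
    total_votes = sum(i.num_votes for i in items)
    if not items or total_votes == 0:
        return list(items)  # nothing to rank on
    # Site-wide statistics, computed once for the whole list.
    mean_votes = total_votes / len(items)
    mean_rating = sum(i.num_votes * i.average_rating for i in items) / total_votes
    return sorted(
        items,
        key=lambda i: bayesian_rating(i, mean_votes, mean_rating),
        reverse=True,
    )


if __name__ == "__main__":
    sample = [
        Item("A", 4.6, 20),
        Item("B", 5.0, 2),
        Item("C", 2.5, 100),
        Item("D", 4.5, 2),
    ]
    for item in rank_top_rated(sample):
        print(item.item_name)
```

With the four example items from the question, the 4.6/20 item ranks above the 5.0/2 item, and the 2.5/100 item ranks below the 4.5/2 item.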

The question is how much higher the 4.6/20 item should be ranked than the 5.0/2 item...

One idea is to leave out items that do not have at least x votes.

Another idea is to fill up with "medium" votes. Decide that 10 votes shall be the minimum. The 5.0/2 item must then be filled up with 8 virtual votes of 2.5.

5.0/2 means 2 votes of 5.0; add 8 votes of 2.5 and you get 30/10 -> 3.0 ;)

Now you have to decide how many votes an item must have at least. Items that already have the minimum number of votes are compared directly by their average.

4.5/20 > 4.4/100
5.0/2  < 3.1/20  (as 5.0/2 is, as we calculated, 3.0/10)
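
The virtual-vote padding described above could look roughly like this. It is only a sketch: the function name `padded_rating` and its parameter names are made up for illustration, and the defaults (minimum of 10 votes, filler vote of 2.5) are taken from the example numbers in this answer.

```python
def padded_rating(average_rating, num_votes, min_votes=10, filler=2.5):
    """Pad an item with virtual 'medium' votes until it has min_votes.

    Items that already have at least min_votes keep their plain average.
    """
    if num_votes >= min_votes:
        return average_rating
    missing = min_votes - num_votes
    # Real votes plus virtual filler votes, averaged over min_votes.
    return (average_rating * num_votes + filler * missing) / min_votes


if __name__ == "__main__":
    print(padded_rating(5.0, 2))   # 2 real votes padded with 8 votes of 2.5
    print(padded_rating(3.1, 20))  # already above the minimum, unchanged
```

Sorting the items by `padded_rating` in descending order then gives the "Top Rated" list under this scheme.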
tuergeist

How about you give each 10 votes a weight of 1, so 20 votes give the item a weight of 2? Then, if the item has 0 weight, it will lose 0.5 from the average.

4.6/20 = 20/10: 2 weight
5.0/2 = 2/10: 0 weight

(4.6 * 0.02) + 4.6 = 4.692
(5.0 * 0.00) + 5.0 - 0.5 = 4.5

2.5/100 = 100/10: 10 weight
4.5/2 = 2/10: 0 weight

(2.5 * 0.1) + 2.5 = 2.75
(4.5 * 0.0) + 4.5 - 0.5 = 4
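
The arithmetic above can be sketched as a small function. Note the assumptions: the answer does not spell out how the weight becomes a multiplier, so the factor `weight / 100` is inferred from the worked numbers (weight 2 gives 0.02, weight 10 gives 0.1). The function name `weighted_rating` and its parameter names are hypothetical.

```python
def weighted_rating(average_rating, num_votes, votes_per_weight=10, penalty=0.5):
    """Boost the average by weight/100 of itself, or penalize zero-weight items.

    weight is one point per full block of votes_per_weight votes.
    The /100 scaling is an assumption inferred from the worked examples.
    """
    weight = num_votes // votes_per_weight
    if weight == 0:
        # Too few votes to earn any weight: subtract the fixed penalty.
        return average_rating - penalty
    return average_rating * (weight / 100) + average_rating


if __name__ == "__main__":
    for rating, votes in [(4.6, 20), (5.0, 2), (2.5, 100), (4.5, 2)]:
        print(rating, votes, weighted_rating(rating, votes))
```

One thing to keep in mind with this sketch is that the score is not capped at 5: an item with very many votes keeps gaining bonus, so the result is only useful for ordering, not for display as a star rating.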
andho