
Suppose three different games/movies received the following ratings:

A: 9.1 from 8000 votes

B: 9.3 from 500 votes

C: 9.5 from 60 votes

What is the best formula to normalize them for comparison? That is, I want to predict what the ratings of B and C would be if 8000 votes had been cast for them, so that I can compare them with A.

Is there an online calculator for this? Also, I don't have access to how each individual rated them.

George

2 Answers


You can simply calculate the proportion of good votes, but it's better to add a correction for the total number of votes given.

One way to correct is to add "dummy" bad votes (e.g. 20) to every item, so that the modified percentage is the number of good votes divided by the total number of votes plus the dummy votes.

Items with a large number of votes see their modified percentage change very little from their real percentage, but items with relatively few votes see their modified percentage move considerably toward low values.

This is known as "Bayesian averaging". In effect, the item with many votes will rank higher than items with the same percentage but fewer votes.
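
As a rough illustration, the same idea can be adapted to the averages in the question, where only the mean rating and the vote count are known. This is a minimal sketch, not a definitive formula: the function name `adjusted_rating` is hypothetical, the 20 dummy votes come from the example above, and the value of 1.0 for each dummy vote is an assumption (the lowest score on an assumed 1 to 10 scale).

```python
def adjusted_rating(avg, votes, dummy_votes=20, dummy_rating=1.0):
    """Mean rating after mixing in `dummy_votes` extra votes worth `dummy_rating` each.

    dummy_votes=20 follows the example above; dummy_rating=1.0 is an
    assumption (lowest score on an assumed 1-10 scale).
    """
    return (avg * votes + dummy_rating * dummy_votes) / (votes + dummy_votes)


# (average rating, number of votes) taken from the question
items = {"A": (9.1, 8000), "B": (9.3, 500), "C": (9.5, 60)}

adjusted = {name: adjusted_rating(avg, n) for name, (avg, n) in items.items()}

# Item names ordered from highest to lowest adjusted rating
ranking = sorted(adjusted, key=adjusted.get, reverse=True)

for name in ranking:
    print(f"{name}: {adjusted[name]:.2f}")
```

With these assumed values, A's rating barely moves while C's drops the most, so the order comes out as A, B, C. A larger `dummy_votes` or a different `dummy_rating` changes how strongly small vote counts are penalized, so both should be tuned to the data at hand.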

Community
Tomaso Neri

They are averages, and thus are already "normalized". Unless you have some reason to expect that adding another 7,500 votes to B would change the average, the number of votes cast does not change the average rating.

If you had some other data such as "ratings drop by 10% after the first X votes" then you could possibly extrapolate, but as it is there's no reason to expect that the average would change with more votes.

D Stanley
  • It's just the average, and he wants to take it a step further and make the number of votes play a bigger role in the total score. I have the same question but can't seem to find a good way to do it. I have two data points: X, the number of games played (lowest: 1, highest: 1000), and Y, the average of game scores [sum of game scores / number of games played] (lowest: 4, highest: 6.5). I don't have a record of the individual game scores, only the average. How can I generate a score based on these two variables such that [x=1, y=6.5] is worse (has a lower score) than [x=5, y=5.5]? – DarkMental Apr 27 '18 at 03:37