
Another way of asking this is: can we use relative rankings from separate data sets to produce a global rank?

Say I have several data sets, each with its own ranking of baby animals by cuteness: 1) Kittens, 2) Puppies, 3) Sloths, and 4) Elephants. I obtained these rankings through pairwise comparisons (i.e., showing people two random pictures of the animal and asking them to select the cuter one). Each data set also contains the complete set of comparisons (e.g., every puppy was compared with every other puppy in the puppy data set).

I'm now trying to merge the data sets to produce a single global cuteness ranking across all animals.

The main issue with relative rankings is that the top-ranked animal in one set is not necessarily comparable to the top-ranked animal in another. For example, suppose baby elephants are considered less than attractive, so that the least cute kitten will always beat the cutest elephant. How should I get around this problem?

I am thinking of doing a few cross comparisons between data sets (Kittens vs Elephants, Puppies vs Kittens, etc.) to establish some sort of baseline, but this may become problematic as the number of animals and the number of animal types grow.

I was also thinking of looking further into filling in sparse matrices, but I think this only applies within a single data set rather than to comparisons across multiple data sets?

truckbot

1 Answer


You can do this with a rating system, such as the well-known Elo and Glicko systems, or our own rankade. A rating system builds a ranking from pairwise comparisons, and

  • you don't need to run every possible comparison, nor do all animals need to appear in the same number of comparisons,
  • you don't need to restrict comparisons to a single data set (let all animals 'play' against all other animals; if you then need a ranking for one data set, take the global ranking and ignore the animals from the others).
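To make the two points above concrete, here is a minimal sketch of a plain Elo update applied to a sparse mix of within-set and cross-set comparisons. It is only an illustration: the animal names, the comparison list, the starting rating of 1500, and the K-factor of 32 are all assumptions chosen for the example, not values from the question.

```python
from collections import defaultdict

K = 32  # assumed update step; a conventional Elo choice, not a tuned value


def expected_score(r_a, r_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update(ratings, winner, loser, k=K):
    """Shift both ratings by the same amount after one pairwise comparison."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


# Every animal starts at the same (assumed) rating.
ratings = defaultdict(lambda: 1500.0)

# Hypothetical results as (winner, loser); note that not every pair is played
# and that comparisons freely cross data-set boundaries.
comparisons = [
    ("kitten_1", "kitten_2"),
    ("kitten_2", "elephant_1"),
    ("elephant_1", "elephant_2"),
    ("kitten_1", "elephant_1"),
]
for winner, loser in comparisons:
    update(ratings, winner, loser)

# Global ranking, best first.
global_ranking = sorted(ratings, key=ratings.get, reverse=True)

# Per-data-set ranking: keep the global order, drop the other animals.
kitten_ranking = [a for a in global_ranking if a.startswith("kitten")]
```

Because the rating is shared, a handful of cross-set comparisons is enough to place kittens and elephants on one scale; more comparisons mainly reduce noise.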

Using rankade (here's a comparison with the aforementioned rating systems and Microsoft's TrueSkill), you can also record outcomes for matches with more than two items, which Elo and Glicko do not support. Asking people to rank many items at once is messy and difficult, but a small multi-item comparison (e.g. 3-5 animals) should be practical and useful for your work.
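For readers who want to experiment with multi-item matches without a dedicated system, one simple workaround is to break each match into all of its implied pairwise results and apply an Elo-style update to each pair, scoring a tie as half a win. The sketch below does that; it is not rankade's algorithm, and the function name `update_from_match`, the K-factor of 16, and the starting rating of 1500 are assumptions made for this example.

```python
from itertools import combinations


def expected_score(r_a, r_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_from_match(ratings, placements, k=16.0, default=1500.0):
    """Update ratings from one multi-item match.

    placements maps each animal to its place (1 = best); animals sharing
    a place are treated as tied. Hypothetical helper, not a library API.
    """
    # Use pre-match ratings for every pair so the pair order does not matter.
    before = {a: ratings.get(a, default) for a in placements}
    deltas = dict.fromkeys(placements, 0.0)
    for a, b in combinations(placements, 2):
        if placements[a] < placements[b]:
            score_a = 1.0   # a placed ahead of b
        elif placements[a] > placements[b]:
            score_a = 0.0   # a placed behind b
        else:
            score_a = 0.5   # tie
        change = k * (score_a - expected_score(before[a], before[b]))
        deltas[a] += change
        deltas[b] -= change
    for a in placements:
        ratings[a] = before[a] + deltas[a]


ratings = {}
# Hypothetical four-animal match: one winner, two tied for second, one fourth.
match = {"kitten_1": 1, "puppy_1": 2, "sloth_1": 2, "elephant_1": 4}
update_from_match(ratings, match)
```

A lower K-factor is used here than is typical for two-item Elo because each animal takes part in several implied pairings per match; that choice is a judgment call rather than an established rule.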

Tomaso Neri
  • Thanks for the suggestion, and cool website! (I might check that out the next time my friends and I play a tournament.) I could just do a global ranking from the beginning, and then to obtain a local ranking, just eliminate the other animals. One potential problem is scaling though - as you mentioned, this would be fine for 3-5 animals, but I plan on adding in hundreds of animals. – truckbot Oct 09 '16 at 16:14
  • Just a note: I suggested 3-5 (or even more) animals for **each comparison** (i.e. *match*), performing a multiple-items comparison instead of the 'classic' two-items one, but there's no such limit on the number of animals in a group. – Tomaso Neri Oct 09 '16 at 17:36
  • Oh, I see. Thank you for the clarification. So it would be something like a tri-wise or quintuple-wise comparison. I suppose this method would not handle ties very well, but it would be a good method for scaling. – truckbot Oct 09 '16 at 18:53
  • You can build a match with 2 to 30 factions (animals, in your case), and define the final order as per your needs, including 'custom' ties (for instance, in a four-faction match, you can have a winner, two factions tied for second place and one faction in the fourth place). – Tomaso Neri Oct 09 '16 at 19:54
  • As clarification, is the following the type of match you're talking about? Example: A match consists of 5 different baby animal pictures. The end user selects the winner and the second place through fifth place (or ties) for this match. Thanks for continuing to answer my questions! – truckbot Oct 10 '16 at 04:06
  • You're welcome! And yes, you can enter 5-animal matches as per your example. – Tomaso Neri Oct 10 '16 at 08:58