2

I am trying to build leaderboards in Redis and be able to get top X scores and retrieve a rank of user Y.

Sorted lists in Redis look like an easy fit except for one problem - I need scores to be sorted not only by actual score, but also by date (so whoever got the same score earlier will be on top). SQL query would be:

select * from scores order by score desc, date asc

Running zrevrange on a sorted set in Redis uses something like:

select * from scores order by score desc, key desc

Which would put users with lexicographically bigger keys above.

One solution I can think of is making some manipulations with a score field inside a sorted set to produce a combined number that consists of a score and a timestamp.

For example for a score 555 with a timestamp 111222333 the final score could be something like 555.111222333 which would put newer scores above older ones (not exactly what I need but could be adjusted further).

This would work, but only on small numbers, as a score in a sorted set has only 16 significant digits, so 10 of them will be wasted on a timestamp right away leaving not much room for an actual score.

Any ideas how to make a sorted set arrange values in a correct order? I would really want an end result to be a sorted set (to easily retrieve user's rank), even if it requires some temporary structures and sorts to build such set.

Perception
  • 79,279
  • 19
  • 185
  • 195
serg
  • 109,619
  • 77
  • 317
  • 330

4 Answers4

1

Actually, all my previous answers are terrible. Disregard all my previous answers (although I'm going to leave them around for the benefit of others).

This is how you should actually do it:

  • Store only the scores in the zset
  • Separately store a list of each time a player achieved that score.

For example:

score_key = <whatever unique key you want to use for this score>
redis('ZADD scores-sorted %s %s' %(score, score))
redis('RPUSH score-%s %s' %(score, score_key))

Then to read the scores:

top_score_keys = []
for score in redis('ZRANGE scores-sorted 0 10'):
    for score_key in redis('LRANGE score-%s 0 -1' %(score, )):
        top_score_keys.append(score_key)

Obviously you'd want to do some optimizations there (ex, only reading hunks of the score- list, instead of reading the entire thing).

But this is definitely the way to do it.

User rank would be straight forward: for each user, keep track of their high score:

redis('SET highscores-%s %s' %(user_id, user_high_score))

Then determine their rank using:

user_high_score = redis('GET highscores-%s' %(user_id, ))
score_rank = int(redis('ZSCORE scores-sorted %s' %(user_high_score, )))
score_rank += int(redis('LINDEX score-%s' %(user_high_score, )))
David Wolever
  • 148,955
  • 89
  • 346
  • 502
  • Sorry I don't see how this creates secondary order by date. How to get top 10 scores ordered by `score desc, date asc`? How to get rank of a specific user? – serg May 16 '12 at 00:28
  • See the second example, which shows how you'd get the top 10 scores ordered by score (by the `ZRANGE`) and by date, descending (from the `LRANGE`, because the list is ordered oldest-to-newest. Oh, wait, but you want newest-to-oldest — in that case, change the `RPUSH` to `LPUSH` and the list will be ordered newest-to-oldest. – David Wolever May 16 '12 at 01:00
  • And see my edit which shows how a user's rank would be calculated. – David Wolever May 16 '12 at 01:06
  • I added 1 question http://stackoverflow.com/questions/12854149/redis-suggesstion-for-selecting-data-type i think it's related to this post. But i am not quit understand your ans. Is this works with time also. – PHP Connect Oct 12 '12 at 09:15
0

It's not really the perfect solution, but if you make a custom epoch that would be closer to the current time, then you would need less digits to represent it.

For instance if you use January 1, 2012 for your epoch you would (currently) only need 8 digits to represent the timestamp.

Here's an example in ruby:

(Time.new(2012,01,01,0,0,0)-Time.now).to_i

This would give you about 3 years before the timestamp would require 9 digits, at which time you could perform some maintenance to move the custom epoch forward again.

I would however love to hear if anyone has a better idea, since I have the excact same problem.

Thomas Dippel
  • 587
  • 1
  • 5
  • 13
0

(Note: this answer is almost certainly suboptimial; see https://stackoverflow.com/a/10575370/71522)

A couple thoughts:

  • You could make some assumptions about the timestamps to make them smaller. For example, instead of storing Unix timestamps, you could store "number of minutes since May 13, 2012" (for example). In exchange for seven significant digits, this would let you store times for the next 19 years.
  • Similarly, you could reduce the number of significant digits in the scores. For example, if you expect scores to be in the 7-digit range, you could divide them by 10, 100, or 1000 when storing them in the sorted list, then use the results of the sorted list to access the actual scores, sorting those at the application level.

For example, using both of the above (in potentially buggy pseudo-code):

score_small = int(score / 1000)
time_small = int((time - 1336942269) / 60)
score_key = uuid()
redis('SET full-score-%s "%s %s"' %(score_key, score, time))
redis('ZADD sorted-scores %s.%s %s' %(score_small, time_small, score_key))

Then to load them (approximately):

top_scores = []
for score_key in redis('ZRANGE sorted-scores 0 10'):
    score_str, time_str = redis('GET full-score-%s' %(score_key, )).split(" ")
    top_scores.append((int(score_str), int(time_str))
top_scores.sort()

This operation could even be done entirely inside Redis (avoid the network overhead of the O(n) GET operations) using the EVAL command (although I don't know enough Lua to confidently provide an example implementation).

Finally, if you expect a truly huge range of scores (for example, you expect that there will be a large number of scores below 10,000 and an equally large number of scores over 1,000,000), then you could use two sorted sets: scores-below-100000 and scores-above-100000.

Community
  • 1
  • 1
David Wolever
  • 148,955
  • 89
  • 346
  • 502
0

(Note: this answer is almost certainly suboptimial; see https://stackoverflow.com/a/10575370/71522)

Instead of using a timestamp in the score, you could use a global counter. For example:

score_key = <whatever unique key you want to use for this score>
score_number = redis('INCR global-score-counter')
redis('ZADD sorted-scores %s.%s %s' %(score, score_number, score_key)

And to sort them in descending order, pick a large score count (1<<24, say), use that as the initial value of global-score-counter, then use DECR instead of INCR.

(this would also apply if you are using a timestamp)

Alternately, if you really incredibly worried about the number of players, you could use a per-score counter:

score_key = <whatever unique key you want to use for this score>
score_number = redis('HINCR score-counter %s' %(score, ))
redis('ZADD sorted-scores %s.%s %s' %(score, score_number, score_key))
Community
  • 1
  • 1
David Wolever
  • 148,955
  • 89
  • 346
  • 502