4

Redis has a data structure called a sorted set.

The interface is roughly that of a SortedMap, but sorted by value rather than key. I could almost make do with a SortedSet, but it seems to assume that the sort values are static.

Is there a canonical Java implementation of a similar concept?

My immediate use case is to build a set with a TTL on each element. The value of the map would be the expiration time, and I'd periodically prune expired elements. I'd also be able to bump the expiration time periodically.

Zoyd
ABentSpoon
  • Have a look at java.util.PriorityQueue. It's not a set, but it does keep elements arranged in order and makes pruning easy. – Stuart Marks Apr 30 '14 at 04:25

3 Answers

1

So... several things.

First, decide which kind of access you'll be doing more of. If you'll be doing more HashMap actions (get, put) than accessing a sorted list, then you're better off just using a HashMap and sorting the values when you want to prune the collection.

As for pruning the collection, it sounds like you want to remove every value with a time less than some timestamp rather than removing the earliest n items. If that's the case, then you're better off just filtering the HashMap based on whether the value meets a condition. A single pass over the map is O(n), which is probably faster than sorting the values first (O(n log n)) and then removing the old entries.
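The filtering approach can be sketched roughly as follows. This is only an illustration: the class name `ExpiringKeys`, its method names, and the choice of `long` timestamps are assumptions made for the example, not something from the question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: a plain HashMap from key to expiration time.
class ExpiringKeys {
    // key -> expiration time (any monotonically comparable long, e.g. epoch millis)
    private final Map<String, Long> expirations = new HashMap<>();

    // Inserts the key, or overwrites ("bumps") its expiration time if present.
    void put(String key, long expiresAt) {
        expirations.put(key, expiresAt);
    }

    boolean contains(String key) {
        return expirations.containsKey(key);
    }

    int size() {
        return expirations.size();
    }

    // Removes every entry whose expiration time is <= now.
    // One O(n) pass over the values, no sorting. Returns the number removed.
    int prune(long now) {
        int before = expirations.size();
        expirations.values().removeIf(expiresAt -> expiresAt <= now);
        return before - expirations.size();
    }
}
```

Lookups and bumps stay O(1) on average; the cost is that every prune scans the whole map, which is fine as long as pruning is infrequent relative to get/put.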

Chris Gerken
1

Since you need two separate conditions, one on the keys and one on the values, the best performance on very large data sets will likely require two data structures. You could keep the objects in a regular Set and, separately, insert the same objects into a PriorityQueue ordered by TTL. To bump a TTL, write the new value into an "additional TTL" field on the object instead of touching the queue. Then, when you remove the next object from the queue, check whether it has an additional TTL; if so, put it back with this new TTL and reset the additional TTL to 0. (I suggest this because removing an arbitrary element from a PriorityQueue costs O(n).) This gives O(log n) time for insertion and for removal of the next object (plus the cost of re-inserting bumped objects, which depends on how often bumps happen), and O(1) or O(log n) time for bumping a TTL, depending on the implementation of Set that you choose.

Of course, the cleanest approach would be to design a new class encapsulating all this.
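One way such a class might look is sketched below. The names (`LazyTtlSet`, `addOrBump`, `prune`) are hypothetical, a HashMap keyed by string stands in for the "regular Set", and the sketch assumes that bumps only ever extend an expiration time.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch: membership in a HashMap, expiry order in a PriorityQueue.
class LazyTtlSet {
    private static final class Entry {
        final String key;
        long queuedExpiry;  // the expiry the queue position is based on
        long latestExpiry;  // the most recent expiry, possibly bumped since queuing

        Entry(String key, long expiry) {
            this.key = key;
            this.queuedExpiry = expiry;
            this.latestExpiry = expiry;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final PriorityQueue<Entry> byExpiry =
            new PriorityQueue<>(Comparator.comparingLong((Entry e) -> e.queuedExpiry));

    // Adds the key, or bumps its expiry if already present.
    // A bump only writes a field; the queue is not touched, so it is O(1) on average.
    void addOrBump(String key, long expiresAt) {
        Entry e = entries.get(key);
        if (e == null) {
            e = new Entry(key, expiresAt);
            entries.put(key, e);
            byExpiry.add(e);
        } else {
            // Assumption: bumps only extend the expiry, never shorten it.
            e.latestExpiry = Math.max(e.latestExpiry, expiresAt);
        }
    }

    boolean contains(String key) {
        return entries.containsKey(key);
    }

    int size() {
        return entries.size();
    }

    // Polls entries whose queued expiry is <= now. Entries that were bumped
    // past now are re-queued under their latest expiry; the rest are removed.
    int prune(long now) {
        int removed = 0;
        while (!byExpiry.isEmpty() && byExpiry.peek().queuedExpiry <= now) {
            Entry e = byExpiry.poll();
            if (e.latestExpiry > now) {
                // Safe to mutate: the entry is outside the queue at this point.
                e.queuedExpiry = e.latestExpiry;
                byExpiry.add(e);
            } else {
                entries.remove(e.key);
                removed++;
            }
        }
        return removed;
    }
}
```

The ordering field is only changed while the entry is outside the queue, which keeps the heap invariant intact. Between prunes, `contains` may still report a key whose time has passed, just as with any periodic-pruning scheme.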

Also, all of this is overkill if your data set is not very large.

Zoyd
1

You can implement it using a combination of two data structures: a sorted mapping of keys to scores, and a sorted reverse mapping of scores to keys.

In Java, typically these would be implemented with TreeMap (if we are sticking to the standard Collections Framework).

Redis uses skip lists to maintain the ordering, but skip lists and balanced binary search trees (such as the red-black tree behind TreeMap) both provide O(log(N)) access here: expected time for skip lists, guaranteed for TreeMap.

A sorted set can then be implemented as an independent class as follows:

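import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class SortedSet {
  // key -> current score
  private final TreeMap<String, Integer> keyToScore = new TreeMap<>();
  // score -> all keys that currently have that score
  private final TreeMap<Integer, Set<String>> scoreToKey = new TreeMap<>();

  void addItem(String key, int score) {
    Integer oldScore = keyToScore.put(key, score);
    if (oldScore != null) {
      // Key was already present: remove it from its old score bucket
      Set<String> oldKeys = scoreToKey.get(oldScore);
      oldKeys.remove(key);
      if (oldKeys.isEmpty()) {
        scoreToKey.remove(oldScore);
      }
    }
    // Add the key under its new score
    scoreToKey.computeIfAbsent(score, s -> new TreeSet<>()).add(key);
  }

  // Returns all keys with startScore <= score <= endScore, in ascending score order
  List<String> getKeysInRange(int startScore, int endScore) {
    List<String> result = new ArrayList<>();
    for (Set<String> keys : scoreToKey.subMap(startScore, true, endScore, true).values()) {
      result.addAll(keys);
    }
    return result;
  }

  // ....

}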
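Applied to the TTL use case from the question, the same two-map idea might look like the sketch below. Everything here is illustrative: the class name `TtlIndex` and its methods are made up for the example, and the forward map is a HashMap rather than a TreeMap on the assumption that key order is not needed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: forward map for lookups, sorted reverse map for pruning.
class TtlIndex {
    private final Map<String, Long> keyToExpiry = new HashMap<>();
    private final TreeMap<Long, Set<String>> expiryToKeys = new TreeMap<>();

    // Inserts the key, or moves it to a new expiry bucket if already present.
    void put(String key, long expiresAt) {
        Long old = keyToExpiry.put(key, expiresAt);
        if (old != null) {
            Set<String> bucket = expiryToKeys.get(old);
            bucket.remove(key);
            if (bucket.isEmpty()) {
                expiryToKeys.remove(old);
            }
        }
        expiryToKeys.computeIfAbsent(expiresAt, t -> new HashSet<>()).add(key);
    }

    boolean contains(String key) {
        return keyToExpiry.containsKey(key);
    }

    int size() {
        return keyToExpiry.size();
    }

    // Removes every key whose expiry is <= now and returns how many were removed.
    int prune(long now) {
        // headMap returns a live view of the entries with expiry <= now.
        Map<Long, Set<String>> expired = expiryToKeys.headMap(now, true);
        int removed = 0;
        for (Set<String> bucket : expired.values()) {
            keyToExpiry.keySet().removeAll(bucket);
            removed += bucket.size();
        }
        expired.clear();  // clearing the view deletes those entries from the TreeMap
        return removed;
    }
}
```

Compared with scanning a HashMap, pruning here only touches the expired entries, at the price of an O(log n) bucket update on every insert or bump.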

Saurabh Maurya