I have a HashMap with String-to-integer mappings, and I'm trying to select a random entry from it, weighted by the integer associated with each entry. The code I'm currently using regenerates the cumulative sums for all the entries on every call, which is very slow and inefficient when the operation has to be performed many times. Any ideas as to how I should do this?

My code, which is in terrible need of optimization:

public String getRandom(Random rnd) {

    ArrayList<Entry<String, MutableInteger>> entries = new ArrayList<>(results.entrySet());

    ArrayList<Long> ints = new ArrayList<>();
    ArrayList<String> vals = new ArrayList<>();

    long cumulative = -1;

    for(Entry<String, MutableInteger> e : entries) {
        ints.add(cumulative += e.getValue().get());
        vals.add(e.getKey());
    }

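    // cumulative now holds (total weight - 1), so the draw must cover [0, cumulative] inclusive.
    // Math.floorMod also avoids the negative value Math.abs returns for Long.MIN_VALUE.
    long l = Math.floorMod(rnd.nextLong(), cumulative + 1);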
    int index = Collections.binarySearch(ints, l);
    index = (index >= 0) ? index : -index-1;

    return vals.get(index);
}
takra
  • First thing: why calculate the cumulative sums each time you sample? Do it once, and again only after modifying your data (a class abstracting this might be a good idea, so that valid data structures for sampling are always kept ready). Second thing: if the sampling method is still too slow, use something more advanced like the **alias-method**. – sascha Oct 02 '16 at 18:08
  • See also http://stackoverflow.com/questions/20863638/weighted-sampling-with-replacement-in-java – kennytm Oct 02 '16 at 18:21
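
The first suggestion in the comments (build the cumulative sums once and reuse them) could look roughly like the sketch below. It is only an illustration under some assumptions: `WeightedSampler` and `next` are hypothetical names, the weights are taken as a plain `Map<String, Integer>` rather than the `MutableInteger` values from the question, and entries with a weight of zero or less are skipped. The modulo draw carries a tiny bias that should be negligible for realistic totals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical helper: caches cumulative weights so each draw is a single binary search. */
final class WeightedSampler {
    private final List<String> keys = new ArrayList<>();
    private final long[] cumulative; // cumulative[i] = sum of the weights of keys 0..i
    private final long total;        // sum of all positive weights

    /** Builds the cumulative table once; entries with weight <= 0 are skipped (an assumption). */
    WeightedSampler(Map<String, Integer> weights) {
        long[] sums = new long[weights.size()];
        long running = 0;
        int n = 0;
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            if (e.getValue() <= 0) {
                continue; // skipping keeps the sums strictly increasing
            }
            running += e.getValue();
            keys.add(e.getKey());
            sums[n++] = running;
        }
        if (n == 0) {
            throw new IllegalArgumentException("no positive weights");
        }
        cumulative = Arrays.copyOf(sums, n);
        total = running;
    }

    /** Returns the key whose range of "tickets" contains a draw from [0, total). */
    String next(Random rnd) {
        long ticket = Math.floorMod(rnd.nextLong(), total);
        int pos = Arrays.binarySearch(cumulative, ticket);
        // An exact hit on cumulative[i] means the ticket belongs to entry i + 1;
        // a miss returns -(insertionPoint) - 1, and the insertion point is the entry.
        int index = (pos >= 0) ? pos + 1 : -pos - 1;
        return keys.get(index);
    }
}
```

With this shape, construction is O(n) and each call to `next` is O(log n). The sampler has to be rebuilt whenever the underlying map changes, so it pays off when there are many draws between modifications.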
