4

The problem:

I have items that have weights. The higher an item's weight, the greater the chance that item will go first. I need a clean, simple way of doing this based on core Java (no third-party libraries, jars, etc.).

I've done this for 2 items by summing the weights, then randomly picking a number in that range using Math.random(). Very simple. But for more than 2 items, I can either take more samples in the same range and risk misses, or recompute the sum of the weights of the remaining items and select again (a recursive approach). I suspect there is something out there that can do this faster/cleaner. This code will be used over and over, so I'm looking for an efficient solution.

In essence, it's like randomized weighted permutations.

Some Examples:

  1. A has a weight of 1, B has a weight of 99. If I ran the simulation with this, I would expect to get BA 99% of the time and AB 1% of the time.

  2. A has a weight of 10, B has a weight of 10, and C has a weight of 80. If I ran simulations with this, I would expect C to be the first item in the ordering 80% of the time; in those cases, A and B would have an equal chance of being the next item.

Extra Details:

For my particular problem, there is a small number of items with potentially large weights. Say 20 to 50 items, with weights stored as longs, where the minimum weight is at least 1000. The number of items may increase quite a bit too, so a solution that doesn't depend on the weights being small would be preferred.
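To illustrate, the two-item version I described above looks roughly like this (the class and method names are just for illustration; I use ThreadLocalRandom.nextLong instead of Math.random() so the pick stays exact for long-sized weights, which is my own choice, not part of the question):

```java
import java.util.concurrent.ThreadLocalRandom;

public class TwoItemPick {

    // Weighted pick between two items: returns true if A goes first.
    // Assumes both weights are positive.
    static boolean firstGoesFirst(long weightA, long weightB) {
        // Pick r uniformly in [0, weightA + weightB).
        long r = ThreadLocalRandom.current().nextLong(weightA + weightB);
        // A wins if r lands inside its share of the range.
        return r < weightA;
    }
}
```

With weights 1 and 99, firstGoesFirst(1, 99) returns true about 1% of the time, matching example 1 above.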

halfer
  • 19,824
  • 17
  • 99
  • 186
James Oravec
  • 19,579
  • 27
  • 94
  • 160
  • 1
    If you're hoping to improve an existing solution then maybe you could post at http://codereview.stackexchange.com/? – PakkuDon May 31 '14 at 14:49
  • No code has been written for the cases where I have more than 2 items. Just figured out a couple solutions that "could" work. I'd like to see if anyone has experience that makes this problem very simple. If not, I'll probably go with the recursive solution... – James Oravec May 31 '14 at 14:55
  • 1
    Is weight limited between 1 and 100 (or in other way)? – KonradOliwer May 31 '14 at 15:45
  • For my case, there is a small number of items with potentially large weights. Say 20 to 50 items with weights that are stored in the form of a long, where the minimum weight is at least a 1000. – James Oravec Jun 01 '14 at 14:55

5 Answers

4

You have items with weights:

  • Item A, weight 42
  • Item B, weight 5
  • Item C, weight 96
  • Item D, weight 33

First add up all the weights: 42 + 5 + 96 + 33 = 176

Now pick a random number, r, from 0 up to the sum of the weights: 0 <= r < 176. I have used integers, but you could use reals if required.

Compare r with the ranges defined by the weights:

  • 0 <= r < 42: select item A.
  • 42 <= r < 47 (= 42 + 5): select item B.
  • 47 <= r < 143 (= 47 + 96): select item C.
  • 143 <= r < 176 (= 143 + 33): select item D.

When you have picked the first item, then repeat the process with the three remaining items and a reduced sum of all the weights. Keep repeating until there are no more items to pick.
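A rough Java sketch of the above (the names here are illustrative, not from the question; ThreadLocalRandom.nextLong is used because the question mentions long weights):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class WeightedOrder {

    // Repeatedly picks an item with probability proportional to its
    // weight, removes it, shrinks the total, and repeats until no
    // items remain. Assumes all weights are positive.
    static List<String> order(List<String> names, List<Long> weights) {
        List<String> remainingNames = new ArrayList<>(names);
        List<Long> remainingWeights = new ArrayList<>(weights);
        long total = 0;
        for (long w : remainingWeights) {
            total += w;
        }
        List<String> result = new ArrayList<>();
        while (!remainingNames.isEmpty()) {
            // Pick a random point in [0, total).
            long r = ThreadLocalRandom.current().nextLong(total);
            // Walk the cumulative ranges until r falls inside one.
            int i = 0;
            while (r >= remainingWeights.get(i)) {
                r -= remainingWeights.get(i);
                i++;
            }
            result.add(remainingNames.remove(i));
            total -= remainingWeights.remove(i);
        }
        return result;
    }
}
```

For example, order(["A", "B", "C"], [10, 10, 80]) should put C first about 80% of the time, as in the question's second example.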

rossum
  • 15,344
  • 1
  • 24
  • 38
2

This seems to work fine:

// Can do a weighted sort on weighted items.
public interface Weighted {
    int getWeight();
}

/**
 * Weighted sort of an array - orders them at random but the weight of each
 * item makes it more likely to be earlier.
 *
 * @param values
 */
public static void weightedSort(Weighted[] values) {
    // Build a list containing as many of each item to make up the full weight.
    List<Weighted> full = new ArrayList<>();
    for (Weighted v : values) {
        // Add a v weight times.
        for (int i = 0; i < v.getWeight(); i++) {
            full.add(v);
        }
    }
    // Shuffle it.
    Collections.shuffle(full);
    // Roll them out in the order required.
    int i = 0;
    do {
        // Get the first one in the shuffled list.
        Weighted next = full.get(0);
        // Put it back into the array.
        values[i++] = next;
        // Remove all occurrences of that one from the list.
        full.remove(next);
    } while (!full.isEmpty());
}

// A bunch of weighted items.
enum Heavies implements Weighted {

    Rare(1),
    Few(3),
    Common(6);
    final int weight;

    Heavies(int weight) {
        this.weight = weight;
    }

    @Override
    public int getWeight() {
        return weight;
    }
}

public void test() {
    Weighted[] w = Heavies.values();
    for (int i = 0; i < 10; i++) {
        // Sort it weighted.
        weightedSort(w);
        // What did we get.
        System.out.println(Arrays.toString(w));
    }
}

Essentially, for each item to be sorted, I add it to a new list as many times as its weight. I then shuffle the list, pull the top one out, and clear all occurrences of it from the remainder.

The last test run produced:

[Rare, Common, Few]
[Common, Rare, Few]
[Few, Common, Rare]
[Common, Few, Rare]
[Common, Rare, Few]
[Few, Rare, Common]

which seems to be about right.

NB - this algorithm will fail under at least the following conditions:

  1. The original array has the same object in it more than once.
  2. The weights of the items are insanely huge.
  3. Zero or negative weights will almost certainly mess with the results.

Added

This implements Rossum's idea - please be sure to give him the credit for the algorithm.

public static void weightedSort2(Weighted[] values) {
    // Calculate the total weight.
    int total = 0;
    for (Weighted v : values) {
        total += v.getWeight();
    }
    // Start with all of them.
    List<Weighted> remaining = new ArrayList<>(Arrays.asList(values));
    // Take each at random - weighted by its weight.
    int which = 0;
    do {
        // Pick a random point.
        int random = (int) (Math.random() * total);
        // Pick one from the list.
        Weighted picked = null;
        int pos = 0;
        for (Weighted v : remaining) {
            // Pick this one?
            if (pos + v.getWeight() > random) {
                picked = v;
                break;
            }
            // Move forward by that much.
            pos += v.getWeight();
        }
        // Remove picked from the remaining.
        remaining.remove(picked);
        // Reduce total.
        total -= picked.getWeight();
        // Record picked.
        values[which++] = picked;
    } while (!remaining.isEmpty());
}
OldCurmudgeon
  • 64,482
  • 16
  • 119
  • 213
  • This is very nice, it's along the lines of what I was thinking. The problem I run into is that my application starts with weights of 1000 and they only get bigger (weights are stored in a long), so creating an entry for each unit of weight, or "token", would be overkill for my intended use. The number of items I'm comparing is small though, say 20 items with weights. This might be helpful info :) – James Oravec Jun 01 '14 at 14:50
  • @VenomFangs - You could take a first-pass through the weights and factor them by dividing by their HCF - could that reduce your numbers? rossum's solution does not have that problem. – OldCurmudgeon Jun 01 '14 at 16:58
  • Thx for the suggestion. I have concerns about the HCF though, as with multiple items, there may not be one greater than 1, especially if there is a prime number in the list. This would create a lot of extra computation too... I'll give it some more time, if I don't get something easier/cleaner, I'll accept rossum's solution... – James Oravec Jun 01 '14 at 18:10
  • Dunno who gave me -1, but if you use flags to mark items in place instead of new ArrayList(Arrays.asList(values)) and remaining.remove(picked), it will be more efficient. Memory management and GC are not free. – ggurov Jun 04 '14 at 15:21
0
public class RandomPriorityQueue {

    private TreeMap<Integer, List<WeightedElement>> tree = new TreeMap();
    private Random random = new Random();

    public void add(WeightedElement e) {
        int priority = random.nextInt(e.getWeight());
        if (tree.containsKey(priority)) {
            List<WeightedElement> list = new LinkedList();
            list.add(e);
            tree.put(priority, list);
        } else {
            List<WeightedElement> list = tree.get(priority);
            list.add(random.nextInt(list.size()), e);
        }
    }

    public WeightedElement poll() {
        Map.Entry<Integer, List<WeightedElement>> entry = tree.lastEntry();
        if (entry == null){
            return null;
        }
        List<WeightedElement> list = entry.getValue();
        if (list.size() == 1){
            tree.remove(entry.getKey());
        }
        return list.remove(0);
    }
}

Of course, we would have better performance if we rewrote TreeMap to allow duplicate keys.

KonradOliwer
  • 166
  • 6
  • Your solution looks to be a priority queue that uses random numbers. Based on that, it would not solve the problem specified. – James Oravec Jun 01 '14 at 14:46
  • In which way does it actually not solve your problem? It satisfies your data sample. Is it a problem that it is a queue, or is it something else? If so, perhaps you could specify what you want to do once you have this ordered set of elements. – KonradOliwer Jun 01 '14 at 15:41
  • A couple of things are wrong, which you might be able to fix... 1) Your line `if (tree.containsKey(priority))` needs a `!` at the beginning. 2) After you do that then you can run your code many times. Use `A` of weight 1000 and `B` of weight 9000. You'd expect 10% as the victory for "A" but when you run large simulations (say of size 10000000) you'll see you get a result close to 5.57% instead of the expected 10%. This stems from how you create the random priorities, you might be able to fix this, if you can I'd love to see the update, and might use it. :) – James Oravec Jun 01 '14 at 16:24
0

I found a solution in another answer - I cannot find it right now, but it uses the exponential distribution:

To the i-th element with weight w_i, assign a key pow(random(0,1), 1.0/w_i) (in pseudocode), then sort the elements by their keys in descending order. This takes O(n*log(n)) time, with complexity independent of the actual weights.
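If I understand the technique correctly, a sketch in Java could look like this (sorting descending, since a larger weight pushes pow(u, 1.0/w) closer to 1; the class and names are made up for illustration):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.concurrent.ThreadLocalRandom;

public class KeySort {

    // Returns the indices 0..n-1 ordered so that index i tends to come
    // earlier in proportion to weights[i], using one random key each.
    static Integer[] weightedShuffle(double[] weights) {
        int n = weights.length;
        double[] keys = new double[n];
        for (int i = 0; i < n; i++) {
            // key = u^(1/w): higher weight -> key closer to 1.
            double u = ThreadLocalRandom.current().nextDouble();
            keys[i] = Math.pow(u, 1.0 / weights[i]);
        }
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Descending by key: the biggest key goes first.
        Arrays.sort(order,
                Comparator.comparingDouble((Integer i) -> keys[i]).reversed());
        return order;
    }
}
```

One caution for the question's very large long weights: 1.0/w becomes tiny and all keys crowd near 1.0, so the numerically safer equivalent is to assign -Math.log(u) / w and sort ascending instead.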

Ferazhu
  • 89
  • 2
  • 7
-1

Anyway, for N items you will need N-1 random numbers (at minimum). Then, let's think about an effective way to choose an item by random number.

If there are not too many items, I would use an iterative method, similar to your recursive approach. I would add a boolean flag to each item for skipping the ones chosen in previous iterations. When I choose one in the current iteration, I set its flag to true and skip it in later calculations, subtract its weight from the sum, and go to the next iteration.

If there is a big number of items and one and the same set will be used many times, then a different approach is better. Make a sorted list of them and use a copy of this list in your recursive approach. At every recursion step, binary-search in it, then remove the chosen item.

Actually, the last can be done iteratively too.
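A rough sketch of the flag-based iteration described above (the class and names are made up; all weights are assumed positive):

```java
import java.util.concurrent.ThreadLocalRandom;

public class FlagPick {

    // Produces a weighted-random ordering of indices without removing
    // elements: chosen items are flagged as used and skipped afterwards.
    static int[] order(long[] weights) {
        int n = weights.length;
        boolean[] used = new boolean[n];
        long total = 0;
        for (long w : weights) total += w;
        int[] result = new int[n];
        for (int k = 0; k < n; k++) {
            long r = ThreadLocalRandom.current().nextLong(total);
            int pick = -1;
            for (int i = 0; i < n; i++) {
                if (used[i]) continue;          // skip items chosen earlier
                if (r < weights[i]) { pick = i; break; }
                r -= weights[i];
            }
            used[pick] = true;
            total -= weights[pick];             // shrink the range for next round
            result[k] = pick;
        }
        return result;
    }
}
```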

ggurov
  • 1,496
  • 2
  • 15
  • 21