
I've got a sparse power set for an input (i.e. some combinations have been pre-excluded). Each entry in the power set has a certain score. I want to find the combination of entries that covers all points and maximizes the overall score.

For example, let's say the input is generated as follows:

function powerset(ary) {
  var ps = [[]];
  for (var i = 0; i < ary.length; i++) {
    for (var j = 0, len = ps.length; j < len; j++) {
      ps.push(ps[j].concat(ary[i]));
    }
  }
  return ps;
}

function generateScores() {
  var sets = powerset([0, 1, 2, 3]);
  sets.pop() //remove the last entry to make it "sparse"
  var scores = {};
  for (var i = 1; i < sets.length; i++) { //skip 0-len
    var set = sets[i];
    var val = 0;
    for (var j = 0; j < set.length; j++) {
      val |= (1 << set[j]);
    }
    scores[val] = ~~Math.pow(((Math.random()+1)*4),set.length);
  }
  return scores;
}
var scores = generateScores();

And the output would look like this:

{
  "1": 7,
  "2": 4,
  "3": 36,
  "4": 5,
  "5": 32,
  "6": 50,
  "7": 84,
  "8": 4,
  "9": 30,
  "10": 50,
  "11": 510,
  "12": 47,
  "13": 73,
  "14": 344,
}

Since order doesn't matter, I can convert each combination into a bitmask and use that as the key. So to read the table: a key of "3" is 011 in base 2, which means linking 0-1 yields a score of 36, whereas 0 individually plus 1 individually yields a total of 11. The linkage 0-1 is therefore worth more than the sum of its parts 0,1.
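To make the encoding concrete, here is a minimal sketch of the conversion in both directions, plus the overlap test that follows from it. The helper names toMask, fromMask and disjoint are illustrative and do not appear in the original code.

```javascript
// Hypothetical helpers, named here for illustration only.

// Turn a combination such as [0, 1] into its bitmask key (binary 0011 = 3).
function toMask(combo) {
  var mask = 0;
  for (var i = 0; i < combo.length; i++) {
    mask |= (1 << combo[i]);
  }
  return mask;
}

// Turn a bitmask key back into the list of element indices it links.
function fromMask(mask) {
  var combo = [];
  for (var bit = 0; mask >> bit; bit++) {
    if (mask & (1 << bit)) {
      combo.push(bit);
    }
  }
  return combo;
}

// Two keys can appear in the same solution only if they share no element.
function disjoint(a, b) {
  return (a & b) === 0;
}

console.log(toMask([0, 1]));                // 3
console.log(JSON.stringify(fromMask(11)));  // [0,1,3]
console.log(disjoint(11, 4));               // true: 1011 and 0100 share no bits
console.log((11 | 4) === 15);               // true: together they cover all four elements
```

With keys stored this way, "covers all points" means the chosen keys OR together to 15, and "no element used twice" means every pair of chosen keys is disjoint.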

In doing so, I've reduced this to a weighted subset sum problem, where the goal is to find every combination that sums to 15 (the equivalent of 1111 in base 2) and then take the max. This is where I'm stuck. I tried using dynamic programming, but due to the randomness, I don't see how I can make any reductions. For example, 1-2 may be better than 1,2 (in the above table, "3" has a higher score than "1" + "2"). However, 1-3,2 could be better than 1-2,3 or 1-2-3.

How might I efficiently find the optimal mix? (Brute force isn't feasible.) For this example, the solution would be "11" + "4", for a total of 515.

Matt K
  • maybe some more examples make the problem more clear. why, for example, should `3` return `36`? what i am missing here? – Nina Scholz Sep 09 '15 at 18:29
  • Ah, sorry about that. that's from the above table, which is provided by the input. The table states: `"3": 36,`. The `3` is base 2 for `0011`, which means the first and 2nd values are "set" or "linked". Since the input values are `[0,1,2,3]`, that means `0` and `1` are linked (denoted `0-1`). That entire table is supplied as input. Let me know if that clears it up! – Matt K Sep 09 '15 at 18:42
  • sorry, where is `f(3) = 36`? – Nina Scholz Sep 09 '15 at 19:31
  • 2nd code block, the unnamed object – Matt K Sep 09 '15 at 20:16
  • i have seen the block, but what makes me feel helpless is the point, that random is involved. the interval of every value is between 4 and 8 to the power of the length of the powerset. so with 3 we get a value between `4^Math.ceil(log2(3))` and `8^Math.ceil(log2(3))`. in numbers it is between 16 and 64. – Nina Scholz Sep 09 '15 at 20:50
  • Yep, the randomness of the scores is what has me stuck. Sometimes keeping any two given vertices separated is better, often times not. The *general* trend is that options involving more vertices have a higher score, but since this is not always the case, I can't rely on that, otherwise I could find the solution by picking the options with the most edges. Hope that makes sense! – Matt K Sep 09 '15 at 21:00

3 Answers


You want to find the combination of elements that sum to 15 and don't have any overlapping bits, maximizing the score of the selected elements.

To do this, define a function bestSubset(use, valid) that takes the set of elements it is required to use and the set of elements that are still valid to include but have not yet been considered. It works recursively: it picks an element s from the valid set and considers both the case where s is used and the case where it is not (if s is used, any elements whose bits overlap with s can no longer be used).

Here's a javascript implementation:

var scores = {1:7, 2:4, 3:36, 4:5, 5:32, 6:50, 7:84, 8:4, 9:30, 10:50, 11:510, 12:47, 13:73, 14:344};
var S = [];
for (var prop in scores) {
  S.push([parseInt(prop, 10), scores[prop]]);  // [bitmask key, score]
}

var n = 15;  // Target sum
var k = S.length;  // Number of weights

function bestSubset(use, valid) {
  if (valid.length == 0) {
    var weightSum = 0;
    var scoreSum = 0;
    var weights = [];
    for (var ct=0; ct < use.length; ct++) {
      weightSum += S[use[ct]][0];
      weights.push(S[use[ct]][0]);
      scoreSum += S[use[ct]][1];
    }
    if (weightSum == n) {
      return [weights, scoreSum];
    } else {
      return false;
    }
  }

  // Don't use valid[0]
  var valid1 = [];
  for (ct=1; ct < valid.length; ct++) {
    valid1.push(valid[ct]);
  }
  var opt1 = bestSubset(use, valid1);

  // Use valid[0]
  var use2 = JSON.parse(JSON.stringify(use));
  use2.push(valid[0]);
  var valid2 = [];
  for (ct=1; ct < valid.length; ct++) {
    if ((S[valid[0]][0] & S[valid[ct]][0]) == 0) {
      valid2.push(valid[ct]);
    }
  }
  var opt2 = bestSubset(use2, valid2);

  if (opt1 === false) {
    return opt2;
  } else if (opt2 === false || opt1[1] >= opt2[1]) {
    return opt1;
  } else {
    return opt2;
  }
}

var initValid = [];
for (var ct=0; ct < S.length; ct++) {
  initValid.push(ct);
}
alert(JSON.stringify(bestSubset([], initValid)));

This returns the set [4, 11] with score 515, as you identified in your original post.

From some computational experiments on the non-sparse case (i.e. with d digits and target (2^d)-1, including all numbers 1, 2, ..., (2^d)-1), I found that this runs in time exponential in the number of digits: the number of times it checks validity at the top of the recursive function grows as roughly O(e^(1.47d)). This is much faster than brute force, which separately considers including or excluding each of the numbers 1, 2, ..., (2^d)-1 and therefore takes doubly exponential time, O(2^(2^d)).
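For comparison, here is a sketch of a different technique from the recursion above: a dynamic program over bitmasks that enumerates submasks. It is not part of this answer's method. The function name bestPartition and its parameter d (the number of elements) are assumptions made for illustration. It takes on the order of 3^d steps and 2^d memory, so it only suits small d, and d must stay below 31 because JavaScript bitwise operators work on 32-bit integers.

```javascript
// Sketch of a subset DP; bestPartition is a hypothetical name.
// scores: object mapping bitmask keys to scores, as in the question.
// d: number of elements, so the target mask is (1 << d) - 1.
function bestPartition(scores, d) {
  var full = (1 << d) - 1;
  // best[mask] = highest total score that covers exactly the bits in mask,
  // or -Infinity if no combination of available keys covers it.
  var best = new Array(full + 1).fill(-Infinity);
  var choice = new Array(full + 1).fill(0); // key picked for each mask
  best[0] = 0;
  for (var mask = 1; mask <= full; mask++) {
    var low = mask & -mask; // lowest set bit; the picked key must contain it
    // Enumerate every non-empty submask of mask.
    for (var sub = mask; sub > 0; sub = (sub - 1) & mask) {
      if (!(sub & low) || !(sub in scores)) continue;
      var total = best[mask ^ sub] + scores[sub];
      if (total > best[mask]) {
        best[mask] = total;
        choice[mask] = sub;
      }
    }
  }
  if (best[full] === -Infinity) return false;
  // Walk the recorded choices back to recover the keys used.
  var keys = [];
  for (var m = full; m > 0; m ^= choice[m]) keys.push(choice[m]);
  return [keys, best[full]];
}

var scores = {1:7, 2:4, 3:36, 4:5, 5:32, 6:50, 7:84, 8:4, 9:30, 10:50, 11:510, 12:47, 13:73, 14:344};
console.log(JSON.stringify(bestPartition(scores, 4))); // [[11,4],515]
```

Requiring each picked key to contain the lowest uncovered bit means every partition is examined once rather than once per ordering of its parts.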

josliber
  • Still trying to wrap my head around this, but I think this solves all use cases (aside from exceeding the recursion depth of 1000, but I'm assuming anything where k > 1000 would be time prohibitive anyways). Many thanks! If you have any reading suggestions please feel free to post them, after spinning my wheels for 3 hours I'm fairly certain I never would have arrived at this answer. – Matt K Sep 09 '15 at 21:41
  • @MattK I'm not sure I have a specific suggestion, though I suppose an algorithms textbook would cover these sorts of things. – josliber Sep 09 '15 at 22:06

A different approach (as always):

First thought:

You can assign every value a weight such that the weights of its summands add up to less than the weight of the value itself: if a + b = c, then w_a + w_b < w_c. This leads to a simple weight system.

Second thought:

To keep the weights easy to understand, they should be natural numbers (integers).

Third thought:

Why not just use the numbers themselves, with a small reduction so that sums come out smaller than a single weight?

Together:

I take each number's value as its weight and reduce it by 1, so:

a = 1, b = 2, c = 3, and we need w_a + w_b < w_c
w_a = 0, w_b = 1, w_c = 2 => 0 + 1 < 2

The formula: weight_n = n - 1

Proof:

Every summand carries a penalty of -1, so k summands that add up to n have a total weight of n - k. For k >= 2 this is smaller than n - 1, the weight of the original number.

Another Example:

weight_15 (14) should be greater than the sum of weight_4 (3) and weight_11 (10).

In numbers: 14 > 3 + 10

I mean, no program code is required here.

Emma
Nina Scholz
  • I believe this is essentially what the other answer does. For each weight, you see if it equals the desired sum. If it doesn't, recurse & find a combo that does, thus using 2 summands, 3 summands, etc. – Matt K Sep 10 '15 at 12:53

For those googling this, I used the answer provided by @josliber, without recursion and with overlap protection (see below). Since recursion depth in JS is limited to 1000, I had to use loops. Unfortunately for my use case, I'm still running out of memory, so it looks like I have to use some heuristic.

var scores = {1: 7, 2: 4, 3: 36, 4: 5, 5: 32, 6: 50, 7: 84, 8: 4, 9: 30, 10: 50, 11: 510, 12: 47, 13: 73, 14: 344};
var S = [];
var keys = Object.keys(scores);
for (var i = 0; i < keys.length; i++) {
  S.push([parseInt(keys[i]), scores[keys[i]]]);
}

var range = [0, 1, 2, 3];  // Input elements, as in the question
var n = Math.pow(2, range.length) - 1;  // Target sum
var k = S.length;  // Number of weights

// best[i, j] is stored in position i*(k+1) + j
var best = [];

// Base case
for (var j = 0; j <= k; j++) {
  best.push([[], 0]);
}

// Main loop
for (var i = 1; i <= n; i++) { 
  best.push(false);  // j=0 case infeasible
  for (j = 1; j <= k; j++) {
    var opt1 = best[i * (k + 1) + j - 1];
    var opt2 = false;
    if (S[j - 1][0] <= i) {
      var parent = best[(i - S[j - 1][0]) * (k + 1) + j - 1];
      if (parent !== false) {
        opt2 = [parent[0].slice(), parent[1]];
        var child = S[j - 1];
        var opt2BitSig = 0;
        for (var m = 0; m < opt2[0].length; m++) {
          opt2BitSig |= opt2[0][m];
        }
        if ((opt2BitSig & child[0])) {
          opt2 = false;
        } else {
          opt2[0].push(child[0]);
          opt2[1] += child[1];
        }
      }
    }
    if (opt1 === false) {
      best.push(opt2);
    } else if (opt2 === false || opt1[1] >= opt2[1]) {
      best.push(opt1);
    } else {
      best.push(opt2);
    }
  }
}

console.log(JSON.stringify(best[n * (k + 1) + k]));
Matt K