
If I have an array of numbers and a list of sums that together total the array elements, what's the most effective approach (or at least not a brute-force hack) for determining which of the elements are included in each sum?

A simplified example might look like:

array = [6, 5, 7, 8, 6, 12, 16]
sums = [14, 24, 22]

and I would want to know:

14 includes 8, 6

24 includes 5, 7, 12

22 includes 6, 16

function matchElements(arr, sums) {
    var sumHash;

    // Build a lookup keyed by each sum; each value will hold the matching elements
    function getSumHash() {
        var hash = {},
            i;
        for (i = 0; i < sums.length; i++) {
            hash[sums[i]] = [];
        }
        return hash;
    }
    sumHash = getSumHash();
    // I don't have a good sense of where to start on what goes here...
    return sumHash;
}

var totals = matchElements([6, 5, 7, 8, 6, 12, 16], [14,24,22]),
    total;

for (total in totals) {
   console.log(total + " includes", totals[total]);
}

http://jsfiddle.net/tTMvP/

I do know that there will always be at least one correct answer, and it only matters to me that the numbers check out. Where there are duplicates I do not need to pair the index, only the value as it relates to the total. Is there an established function for resolving this kind of problem?

This is only a JavaScript question because that's the language I'm writing the solution in; it is more of a general mathematics question as filtered through JavaScript. If this is not the appropriate forum, I welcome redirection to the appropriate Stack Exchange site.

Med
Shane
  • While I could use "brute force" as you put it, I don't know of any "fast" algorithm to achieve it. – Xotic750 Feb 21 '14 at 03:51
  • yeah, that's what I was thinking. I said that because I remember some jab in an xkcd cartoon where he told the waiter not to use brute force to calculate his tip when he used some similar question (I think), not the best reason haha... Anything that could eliminate matched elements would be preferred, but I'd settle for brute force, I was just hoping to avoid it. – Shane Feb 21 '14 at 03:55
  • http://www.roryhart.net/code/xkcd-np-complete-restaurant-order/ Here's someone who solved the problem, unfortunately I don't know anything about Minizinc, apparently it's for modeling problems in "Constraint Programming". So it's interesting, but not helpful. – Shane Feb 21 '14 at 04:02

3 Answers


Ok, curse me, this is my knock-up, improvements welcome :)

I believe this is a variant of the bin packing problem or the knapsack problem.

Javascript

General Power Set function

function powerSet(array) {
    var lastElement,
        sets;

    if (!array.length) {
        sets = [[]];
    } else {
        lastElement = array.pop();
        sets = powerSet(array).reduce(function (previous, element) {
            previous.push(element);
            element = element.slice();
            element.push(lastElement);
            previous.push(element);

            return previous;
        }, []);
    }

    return sets;
}

Reduces copies in the power set, i.e. we don't want both [6, 8] and [8, 6] because they are the same set

function reducer1(set) {
    set.sort(function (a, b) {
        return a - b;
    });

    return this[set] ? false : (this[set] = true);
}
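To make the de-duplication step a little more explicit: `this[set]` coerces the sorted array to a string such as "6,8" and uses it as a key on the object passed as the second argument to `filter`. Below is a minimal sketch of the same idea. The helper name `uniqueSets` is hypothetical (it is not part of the answer's code), and unlike `reducer1` it sorts a copy so the input sets are left untouched.

```javascript
// Hypothetical helper: keep only the first occurrence of each set,
// treating [6, 8] and [8, 6] as the same set.
function uniqueSets(sets) {
    var seen = {};

    return sets.filter(function (set) {
        // Sort a copy numerically and join it into a string key, e.g. "6,8"
        var key = set.slice().sort(function (a, b) {
            return a - b;
        }).join(',');

        if (seen[key]) {
            return false; // an equivalent set was already kept
        }
        seen[key] = true;

        return true;
    });
}

console.log(JSON.stringify(uniqueSets([[8, 6], [6, 8], [5, 7, 12]])));
// prints [[8,6],[5,7,12]]
```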

The main function: get a match for the bin, remove the used items, rinse and repeat

function calc(bins, items) {
    var result = {
            unfilled: bins.slice(),
            unused: items.slice()
        },
        match,
        bin,
        index;

    function reducer2(prev, set) {
        if (!prev) {
            set.length && set.reduce(function (acc, cur) {
                acc += cur;

                return acc;
            }, 0) === bin && (prev = set);
        }

        return prev;
    }

    function remove(item) {
        result.unused.splice(result.unused.indexOf(item), 1);
    }

    for (index = result.unfilled.length - 1; index >= 0; index -= 1) {
        bin = result.unfilled[index];
        match = powerSet(result.unused.slice()).filter(reducer1, {}).reduce(reducer2, '');
        if (match) {
            result[bin] = match;
            match.forEach(remove);
            result.unfilled.splice(result.unfilled.lastIndexOf(bin), 1);
        }
    }

    return result;
}

These are our items and the bins they need to be packed into

var array = [6, 5, 7, 8, 6, 12, 16],
    sums = [14, 24, 22];

console.log(JSON.stringify(calc(sums, array)));

Output

{"14":[6,8],"22":[6,16],"24":[5,7,12],"unfilled":[],"unused":[]} 
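One thing to keep in mind is that `calc` commits to the first subset it finds for each bin, so an early choice can in principle leave a later bin unfillable. The following is a rough sketch of a backtracking variant that undoes a choice when a later sum cannot be completed. The name `matchSums` is hypothetical, the sketch assumes positive integers, and it is still exponential in the worst case.

```javascript
// Hypothetical helper: give every sum its own group of items, using each
// item at most once. Returns an array of groups (same order as `sums`),
// or null when no assignment exists. Assumes positive integers.
function matchSums(items, sums) {
    var used = items.map(function () { return false; }),
        groups = sums.map(function () { return []; });

    // Try to complete the group for sums[s], which still needs `remaining`,
    // looking only at items from index `start` onwards.
    function fill(s, remaining, start) {
        var i;

        if (remaining === 0) {
            // This sum is done; succeed if it was the last, else start the next
            return s + 1 === sums.length || fill(s + 1, sums[s + 1], 0);
        }
        for (i = start; i < items.length; i += 1) {
            if (!used[i] && items[i] <= remaining) {
                used[i] = true;
                groups[s].push(items[i]);
                if (fill(s, remaining - items[i], i + 1)) {
                    return true;
                }
                // Dead end: undo the choice and try the next item
                groups[s].pop();
                used[i] = false;
            }
        }

        return false;
    }

    if (sums.length === 0) {
        return [];
    }

    return fill(0, sums[0], 0) ? groups : null;
}

console.log(JSON.stringify(matchSums([6, 5, 7, 8, 6, 12, 16], [14, 24, 22])));
// prints [[6,8],[5,7,12],[6,16]]
```

With items `[1, 1, 2, 2, 3, 3]` and sums `[4, 4, 4]`, picking `[1, 1, 2]` first would strand the rest; the sketch backs out of that choice and still finds a valid grouping.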

On jsfiddle

Xotic750
  • This deserves way more up-ticks than I can give it. It's a shame maybe 100 people a year will see this. Seems like this is the kind of thing we would have well known libraries for, amazing we don't. Brilliant answer and I hope it helps more than me. – Shane Feb 21 '14 at 13:24
  • Thanks, but it is still very much "brute force". I'm sure someone could do very much better. ;) – Xotic750 Feb 21 '14 at 18:22
  • Since this is "np-complete" as mentioned by Vikram Bhat, the best you can do is organized guessing. The more I looked at this the more I realized I had no choice but to lower expectations. At this point, I just want anything that can produce the outcome as this is officially a non-trivial problem. – Shane Feb 21 '14 at 18:25

It might be instructive to show how this could be encoded in a constraint programming system (here MiniZinc).

Here is the complete model. It is also available at http://www.hakank.org/minizinc/matching_sums.mzn

int: n;
int: num_sums;
array[1..n] of int: nums; % the numbers
array[1..num_sums] of int: sums; % the sums

% decision variables

% 0/1 matrix where 1 indicates that number nums[j] is used
% for the sum sums[i]. 
array[1..num_sums, 1..n] of var 0..1: x;

solve satisfy;

% Get the numbers to use for each sum
constraint
   forall(i in 1..num_sums) (
      sum([x[i,j]*nums[j] | j in 1..n]) = sums[i]
   )
;

output 
[
   show(sums[i]) ++ ": " ++ show([nums[j] | j in 1..n where fix(x[i,j])=1]) ++ "\n" 
    | i in 1..num_sums
];

%% Data
n = 6;
num_sums = 3;
nums = [5, 7, 8, 6, 12, 16];
sums = [14, 24, 22];

The matrix "x" is the interesting part, x[i,j] is 1 (true) if the number "nums[j]" is used in the sum of the number "sums[i]".
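For readers more at home in JavaScript, here is a small sketch of what that 0/1 matrix encodes. The helper names `decode` and `satisfies` are hypothetical, and the matrix below is written out by hand rather than produced by a solver. Note that the 6 appears in both the first and the third row: as far as the model shown goes, the only constraint is per row, so nothing stops a number from being used for more than one sum.

```javascript
// Hypothetical helper: list the numbers selected by each row of x
function decode(x, nums) {
    return x.map(function (row) {
        return nums.filter(function (unused, j) {
            return row[j] === 1;
        });
    });
}

// Hypothetical helper: mirror the model's constraint, i.e. for every row i
// the selected numbers must add up to sums[i]
function satisfies(x, nums, sums) {
    return x.every(function (row, i) {
        var total = row.reduce(function (acc, bit, j) {
            return acc + bit * nums[j];
        }, 0);

        return total === sums[i];
    });
}

var nums = [5, 7, 8, 6, 12, 16],
    sums = [14, 24, 22],
    x = [
        [0, 0, 1, 1, 0, 0], // 14 = 8 + 6
        [1, 1, 0, 0, 1, 0], // 24 = 5 + 7 + 12
        [0, 0, 0, 1, 0, 1]  // 22 = 6 + 16
    ];

console.log(satisfies(x, nums, sums));        // prints true
console.log(JSON.stringify(decode(x, nums))); // prints [[8,6],[5,7,12],[6,16]]
```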

For this particular problem there are 16 solutions:

....
14: [8, 6]
24: [8, 16]
22: [6, 16]
----------
14: [6, 8]
24: [6, 5, 7, 6]
22: [6, 16]
----------
14: [6, 8]
24: [5, 7, 12]
22: [6, 16]
----------
14: [6, 8]
24: [6, 6, 12]
22: [6, 16]
----------
14: [6, 8]
24: [8, 16]
22: [6, 16]
----------
...

These are not distinct solutions since there are two 6's. With just one 6 there are 2 solutions:

14: [8, 6]
24: [5, 7, 12]
22: [6, 16]
----------
14: [8, 6]
24: [8, 16]
22: [6, 16]
----------
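The two counts can be cross-checked with a brute-force sketch. It relies on an assumption read off the model above: each row of "x" is constrained independently, so the number of solutions is the product of the number of matching subsets per sum. The helper names `countSubsets` and `countSolutions` are hypothetical, and the bitmask loop is only practical for short arrays.

```javascript
// Hypothetical helper: count the subsets (by position) of nums that add up to target
function countSubsets(nums, target) {
    var count = 0,
        mask,
        j,
        total;

    // Every bit pattern from 0 to 2^n - 1 selects one subset of positions
    for (mask = 0; mask < (1 << nums.length); mask += 1) {
        total = 0;
        for (j = 0; j < nums.length; j += 1) {
            if (mask & (1 << j)) {
                total += nums[j];
            }
        }
        if (total === target) {
            count += 1;
        }
    }

    return count;
}

// Rows are independent, so multiply the per-sum counts
function countSolutions(nums, sums) {
    return sums.reduce(function (product, target) {
        return product * countSubsets(nums, target);
    }, 1);
}

console.log(countSolutions([6, 5, 7, 8, 6, 12, 16], [14, 24, 22])); // prints 16
console.log(countSolutions([5, 7, 8, 6, 12, 16], [14, 24, 22]));    // prints 2
```

With two 6's the per-sum counts are 2, 4 and 2; with a single 6 they are 1, 2 and 1.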

Aside: When I first read the problem I wasn't sure if the objective was to minimize (or maximize) the numbers used. With just some additional variables and constraints the model can be used for that as well. Here is the solution which uses the least count of numbers:

s: {6, 8, 16}
14: [8, 6]
24: [8, 16]
22: [6, 16]
Not used: {5, 7, 12}

And the opposite, the maximum count of numbers used (here all numbers are used since 6 is counted just once in "s"):

s: {5, 6, 7, 8, 12, 16}
14: [8, 6]
24: [5, 7, 12]
22: [6, 16]
Not used: {}

The extended MiniZinc model is available here: http://www.hakank.org/minizinc/matching_sums2.mzn .

(Aside2: A comment mentioned the xkcd restaurant problem. Here is a more general solution for that problem: http://www.hakank.org/minizinc/xkcd.mzn . It's a variant of the current matching problem, the main difference being that a dish can be counted more than once, not just 0..1 as in this matching problem.)

hakank
  • I was intentionally vague in the original requirement because my focus is pretty narrow and I thought it would be better to ask a broad question to benefit others. I'm actually hacking together a script to match sales to deposits from parsing up a few csv files. Going to recurse through date ranges and find the right combinations to assist in reconciling to make up for some irretrievable records. As you might imagine, guessing manually takes impossible amounts of time. :) – Shane Feb 21 '14 at 18:41
  • Yes, I noticed the clarification, but I thought that the optimization variant was quite neat. :-) By the way, there is at least one Javascript package that integrates with a constraint programming system: http://code.google.com/p/fdcp/ which uses Gecode. However, I haven't tested it myself (it's on my todo list). – hakank Feb 21 '14 at 18:53
  • incredibly helpful on the library reference... In order not to clutter the comments too badly, I'll just say I'll be studying your answer later as I'm way out of my depth on this one, most grateful. :) – Shane Feb 21 '14 at 19:03

The subset sum problem is NP-complete, but there is a pseudo-polynomial-time dynamic programming solution:

1. Find the maximum element of the sums array.
2. Treat the problem as a 0/1 knapsack.
3. Use that maximum as the knapsack capacity.
4. Treat each arr[i] as an item whose weight and cost are both arr[i].
5. Maximize the profit for every capacity up to the maximum.
6. A sum sums[i] can be formed if the best profit at capacity sums[i] equals sums[i], i.e. costs[arr.length-1][sums[i]]==sums[i].

Here is a Java implementation of the same:

public class SubSetSum {
    static int[][] costs;

    public static void calSets(int target,int[] arr) {

        costs = new int[arr.length][target+1];
        for(int j=0;j<=target;j++) {
            if(arr[0]<=j) {

                costs[0][j] = arr[0]; 
            }
        }
        for(int i=1;i<arr.length;i++) {

            for(int j=0;j<=target;j++) {
                costs[i][j] = costs[i-1][j];
                if(arr[i]<=j) {
                    costs[i][j] = Math.max(costs[i][j],costs[i-1][j-arr[i]]+arr[i]);
                }
            }

        }

       // System.out.println(costs[arr.length-1][target]);
       /*if(costs[arr.length-1][target]==target) {
           //System.out.println("Sets :");
           //printSets(arr,arr.length-1,target,"");
       } 

       else System.out.println("No such Set found");*/

    } 

    public static void getSubSetSums(int[] arr,int[] sums) {

        int max = -1;
        for(int i=0;i<sums.length;i++) {
            if(max<sums[i]) {
                max = sums[i];
            }
        }

        calSets(max, arr);

        for(int i=0;i<sums.length;i++) {
            if(costs[arr.length-1][sums[i]]==sums[i]) {
                System.out.println("subset forming "+sums[i]+":");
                printSets(arr,arr.length-1,sums[i],"");
            }
        }




    }

    public static void printSets(int[] arr,int n,int w,String result) {


        if(w==0) {
            System.out.println(result);
            return;
        }

        if(n==0) {
           System.out.println(result+","+arr[0]);
            return; 
        }

        if(costs[n-1][w]==costs[n][w]) {
            printSets(arr,n-1,w,new String(result));
        }
        if(arr[n]<=w&&(costs[n-1][w-arr[n]]+arr[n])==costs[n][w]) {
            printSets(arr,n-1,w-arr[n],result+","+arr[n]);
        }
    }

    public static void main(String[] args) {
        int[] arr = {6, 5, 7, 8, 6, 12, 16};
        int[] sums = {14, 24, 22};        
        getSubSetSums(arr, sums);

    }
}
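
Since the question is in JavaScript, here is a rough sketch of the same dynamic programming idea in that language. It is not a port of the Java above: it uses a boolean "reachable" table instead of the cost table, the helper name `findSubset` is hypothetical, and it assumes positive integers. Like the Java version, it treats each target on its own, so it does not stop one number from being reported for several sums.

```javascript
// Hypothetical helper: find one subset of items that adds up to target.
// Returns an array of values, or null when no subset exists.
function findSubset(items, target) {
    var reachable = [],
        subset = [],
        i,
        w;

    // reachable[i][w] is true when some subset of the first i items sums to w
    for (i = 0; i <= items.length; i += 1) {
        reachable.push(new Array(target + 1).fill(false));
        reachable[i][0] = true; // the empty subset sums to 0
    }
    for (i = 1; i <= items.length; i += 1) {
        for (w = 1; w <= target; w += 1) {
            // Either skip item i-1, or use it and reach the rest without it
            reachable[i][w] = reachable[i - 1][w] ||
                (items[i - 1] <= w && reachable[i - 1][w - items[i - 1]]);
        }
    }
    if (!reachable[items.length][target]) {
        return null;
    }
    // Walk the table backwards: if w cannot be reached without item i-1,
    // that item must be part of the subset
    w = target;
    for (i = items.length; i > 0 && w > 0; i -= 1) {
        if (!reachable[i - 1][w]) {
            subset.push(items[i - 1]);
            w -= items[i - 1];
        }
    }

    return subset;
}

[14, 24, 22].forEach(function (sum) {
    console.log(sum + " includes", findSubset([6, 5, 7, 8, 6, 12, 16], sum));
});
```

The table has (items.length + 1) × (target + 1) cells, which is where the pseudo-polynomial running time comes from.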
Vikram Bhat
  • This will take me some time to absorb as I've never touched java and hadn't heard of constraint programming until tonight :) +1 for now regardless while I wade through this. Thanks! – Shane Feb 21 '14 at 04:49
  • I accepted Xotic750's solution over this one simply because it was implemented in javascript. Your post helped deepen my understanding of the problem domain as this was my first real-world introduction. Good info – Shane Feb 21 '14 at 18:32