10

For a game I'm making I have a situation where I have a list of numbers – say [7, 4, 9, 1, 15, 2] (named A for this) – and another list of numbers – say [11, 18, 14, 8, 3] (named B) – provided to me. The goal is to find all combinations of numbers in A that add up to a number in B. For example:

  • 1 + 2 = 3
  • 1 + 7 = 8
  • 2 + 9 = 11
  • 4 + 7 = 11
  • 1 + 2 + 4 + 7 = 14
  • 1 + 2 + 15 = 18
  • 2 + 7 + 9 = 18

...and so on. (For purposes of this, 1 + 2 is the same as 2 + 1.)

For small lists like this, it's trivial to just brute-force the combinations, but I'm facing the possibility of seeing thousands to tens of thousands of these numbers and will be using this routine repeatedly over the lifespan of the application. Is there any kind of elegant algorithm available to accomplish this in reasonable time with 100% coverage? Failing this, is there any kind of decent heuristics I can find that can give me a "good enough" set of combinations in a reasonable amount of time?

I'm looking for an algorithm in pseudo-code or in any decently popular and readable language (note the "and" there....;) or even just an English description of how one would go about implementing this kind of search.


Edited to add:

Lots of good information provided so far. Thanks guy! Summarizing for now:

  • The problem is NP-Complete so there is no way short of brute force to get 100% accuracy in reasonable time.
  • The problem can be viewed as a variant of either the subset sum or knapsack problems. There are well-known heuristics for both which may be adaptable to this problem.

Keep the ideas coming! And thanks again!

JUST MY correct OPINION
  • 35,674
  • 17
  • 77
  • 99
  • Is there an upper limit of elments or a number size? If you keep taht low it is possible to calculate it without waiting tooo long. – Thariama Jul 05 '10 at 11:00
  • It should be possible to use some heuristic if there are certain constrains that can be leveraged. For example, is the size and/or members of either of the arrays A and B constant? Or maybe the range of the number that you are likely to encounter is limited? – tathagata Jul 05 '10 at 11:05
  • @tathagata: The numbers will never exceed 30 nor go below 1. That would be one constraint. – JUST MY correct OPINION Jul 05 '10 at 11:42
  • @Just ...: The numbers are integers between 1 and 30? Is this about the numbers in A? Can we assume numbers in A are distinct? –  Jul 05 '10 at 16:20
  • If so the problem isn't hard. You can create a list with all combinations for the numbers 1 to 30 and which numbers are needed for each combination. Then you create a list with all numbers that occur in the list `A`. Finally you check for every combination if the needed numbers are in the `occurs-in-A-list`. – rudi-moore Jul 05 '10 at 16:41
  • 1
    @Just... Are the numbers in B also between 1 and 30? If so the problem is tractable, given a hefty, but reasonable, amount of space. – Il-Bhima Jul 05 '10 at 16:43
  • @Il-Bhima: Yes. All numbers involved in all such comparisons will be 1-30. Space will be largely a non-issue. – JUST MY correct OPINION Jul 05 '10 at 17:02
  • @Moron: No, we can't assume the numbers are distinct. It's quite possible to have a list that consists only of 1s. – JUST MY correct OPINION Jul 05 '10 at 17:03

4 Answers4

5

This problem is NP-Complete... This is some variation of the sub-set sum problem which is known to be NP-Complete (actually, the sub-set sum problem is easier than yours).

Read here for more information: http://en.wikipedia.org/wiki/Subset_sum_problem

Protostome
  • 5,569
  • 5
  • 30
  • 45
  • I had a sneaking suspicion that it was NP-hard or worse. Are their heuristics I could apply? – JUST MY correct OPINION Jul 05 '10 at 11:23
  • Yes, of course. First of all, there is an approximation algorithm in the wikipedia article, see if it matches your needs. Second, You can apply the branch and bound methodolgy (or alpha-beta pruning). If you have an upper bound on the largest number on your list, you can also apply some heuristics as well, such as compiling a list of all possible additives to a number t in B. (that can be worthwhile only if t,B are relatively small and A is huge...) – Protostome Jul 05 '10 at 11:37
2

As said in the comments with numbers ranging only from 1 to 30 the problem has a fast solution. I tested it in C and for your given example it only needs miliseconds and will scale very well. The complexity is O(n+k) where n is length of list A and k the length of list B, but with a high constant factor (there are 28.598 possibilites to get a sum from 1 to 30).

#define WIDTH 30000
#define MAXNUMBER 30

int create_combination(unsigned char comb[WIDTH][MAXNUMBER+1], 
                       int n, 
                       unsigned char i, 
                       unsigned char len, 
                       unsigned char min, 
                       unsigned char sum) {
    unsigned char j;

    if (len == 1) {
        if (n+1>=WIDTH) {
            printf("not enough space!\n");
            exit(-1);
        }
        comb[n][i] = sum;
        for (j=0; j<=i; j++)
            comb[n+1][j] = comb[n][j];
        n++;
        return n;
    }

    for (j=min; j<=sum/len; j++) {
        comb[n][i] = j;
        n = create_combination(comb, n, i+1, len-1, j, sum-j);
    }

    return n;
}

int main(void)
{
    unsigned char A[6] = { 7, 4, 9, 1, 15, 2 };
    unsigned char B[5] = { 11, 18, 14, 8, 3 };

    unsigned char combinations[WIDTH][MAXNUMBER+1];
    unsigned char needed[WIDTH][MAXNUMBER];
    unsigned char numbers[MAXNUMBER];
    unsigned char sums[MAXNUMBER];
    unsigned char i, j, k;
    int n=0, m;

    memset(combinations, 0, sizeof combinations);
    memset(needed, 0, sizeof needed);
    memset(numbers, 0, sizeof numbers);
    memset(sums, 0, sizeof sums);

    // create array with all possible combinations
    // combinations[n][0] stores the sum
    for (i=2; i<=MAXNUMBER; i++) {
        for (j=2; j<=i; j++) {
            for (k=1; k<=MAXNUMBER; k++)
                combinations[n][k] = 0;
            combinations[n][0] = i;
            n = create_combination(combinations, n, 1, j, 1, i);
        }
    }

    // count quantity of any summands in each combination
    for (m=0; m<n; m++)
        for (i=1; i<=MAXNUMBER && combinations[m][i] != 0; i++)
            needed[m][combinations[m][i]-1]++;

    // count quantity of any number in A
    for (m=0; m<6; m++)
        if (numbers[A[m]-1] < MAXNUMBER)
            numbers[A[m]-1]++;

    // collect possible sums from B
    for (m=0; m<5; m++)
        sums[B[m]-1] = 1;

    for (m=0; m<n; m++) {
        // check if sum is in B
        if (sums[combinations[m][0]-1] == 0)
            continue;

        // check if enough summands from current combination are in A
        for (i=0; i<MAXNUMBER; i++) {
            if (numbers[i] < needed[m][i])
                break;
        }

        if (i<MAXNUMBER)
            continue;

        // output result
        for (j=1; j<=MAXNUMBER && combinations[m][j] != 0; j++) {
            printf(" %s %d", j>1 ? "+" : "", combinations[m][j]);
        }
        printf(" = %d\n", combinations[m][0]);
    }

    return 0;
}

1 + 2 = 3
1 + 7 = 8
2 + 9 = 11
4 + 7 = 11
1 + 4 + 9 = 14
1 + 2 + 4 + 7 = 14
1 + 2 + 15 = 18
2 + 7 + 9 = 18
rudi-moore
  • 2,650
  • 1
  • 19
  • 16
1

Sounds like a Knapsack problem (see http://en.wikipedia.org/wiki/Knapsack_problem. On that page they also explain that the problem is NP-complete in general.

I think this means that if you want to find ALL valid combinations, you just need a lot of time.

Patrick
  • 23,217
  • 12
  • 67
  • 130
1

This is a small generalization of the subset sum problem. In general, it is NP-complete, but as long as all the numbers are integers and the maximum number in B is relatively small, the pseudo-polynomial solution described in the Wikipedia article I linked should do the trick.

svick
  • 236,525
  • 50
  • 385
  • 514