Understanding Sum of subsets

Question

I've just started learning backtracking algorithms at college, and I managed to write a program for the subset-sum problem. It works, but I then discovered that it doesn't print all the possible combinations.

For example, there might be a hundred combinations for a target sum, but my program prints only 30. Here is the code. It would be a great help if anyone could point out what my mistake is.

int tot=0;//tot is the total sum of all the numbers in the set.
int prob[500], d, s[100], top = -1, n; // n = number of elements in the set. prob[i] is the array with the set.
void subset()
{
    int i=0,sum=0; //sum - being updated at every iteration and check if it matches 'd'
    while(i<n)
    {
        if((sum+prob[i] <= d)&&(prob[i] <= d)) 
        {
            s[++top] = i;
            sum+=prob[i];
        }
        if(sum == d) // d is the target sum 
        {
            show(); // this function just displays the integer array 's'
            top = -1; // top points to the recent number added to the int array 's'
            i = s[top+1];
            sum = 0;
        }
        i++;
        while(i == n && top!=-1)
        {
            sum-=prob[s[top]];
            i = s[top--]+1;
        }
    }
}

int main()
{
    cout<<"Enter number of elements : ";cin>>n;
    cout<<"Enter required sum : ";cin>>d;
    cout<<"Enter SET :\n";
    for(int i=0;i<n;i++)
    {
        cin>>prob[i];
        tot+=prob[i];
    }
    if(d <= tot)
    {
        subset();
    }
    return 0;
}

When I run the program :

Enter number of elements : 7
Enter the required sum : 12
Enter SET : 
4 3 2 6 8 12 21

SOLUTION 1 : 4, 2, 6
SOLUTION 2 : 12

Although 4, 8 is also a solution, my program doesn't show it. It's even worse when the number of inputs is 100 or more: there will be at least 10000 combinations, but my program shows 100.

The logic I am trying to follow:

1. Take elements of the main set into a subset as long as the sum of the subset remains less than or equal to the target sum.
2. If adding a particular number to the subset would make the sum larger than the target, don't take it.
3. Once the end of the set is reached and an answer has not been found, remove the most recently taken number from the subset and start looking at the numbers in the positions after that of the number just removed (what I store in the array 's' is the positions of the selected numbers in the main set).

It would help if your variables had more descriptive names (this will be useful for your programming career in general), or at least if you told us what each of them is supposed to mean, how they're declared, initialised etc. – Angew is no longer proud of SO, Mar 29 '13 at 09:42
Can you add some example input, what you expected the output to be, and what the output was? It's not at all clear how this code *gives* anything, or where it gets its input from. – john, Mar 29 '13 at 09:42
Sorry for being so vague. This is the first time I'm posting code online. – thekeystroker, Mar 29 '13 at 10:05
The second `while()` loop seems *weird.* In general, I'm having trouble figuring out your algorithm's logic. Can you phrase it clearly in a few natural-language words? – Angew is no longer proud of SO, Mar 29 '13 at 10:25
OK, I just edited my post and added the logic which I'm trying to use. The second while loop removes the most recently added element from the subset. – thekeystroker, Mar 29 '13 at 11:28

score 1 · Answer 1 · answered Mar 31 '13 at 12:47

The solutions you are going to find depend on the order of the entries in the set due to your "as long as" clause in step 1.

If you take entries as long as they don't get you over the target, once you've taken e.g. '4' and '2', '8' will take you over the target, so as long as '2' is in your set before '8', you'll never get a subset with '4' and '8'.

You should either add a possibility to skip adding an entry (or add it to one subset but not to another) or change the order of your set and re-examine it.
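To check how many solutions a small input really has, one option is a brute-force count in which every entry is either in a subset or not. This is only a sketch for cross-checking, not the backtracking approach itself: `count_subsets_with_sum` is a made-up helper name, it counts non-empty subsets only, and it is practical only for roughly 25 entries or fewer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cross-check helper (not part of the question's code).
// Each bit pattern 'mask' names one subset: bit k set means set[k] is included.
// Assumes set.size() is small (well below 64); the loop runs 2^n - 1 times.
long long count_subsets_with_sum(const std::vector<int>& set, long long target) {
    const std::size_t n = set.size();
    long long count = 0;
    // Start at 1 so the empty subset is not counted.
    for (std::uint64_t mask = 1; mask < (std::uint64_t{1} << n); ++mask) {
        long long sum = 0;
        for (std::size_t k = 0; k < n; ++k) {
            if (mask & (std::uint64_t{1} << k)) sum += set[k];
        }
        if (sum == target) ++count;
    }
    return count;
}
```

For the example in the question, `count_subsets_with_sum({4, 3, 2, 6, 8, 12, 21}, 12)` returns 3, for the subsets {4, 2, 6}, {4, 8} and {12}.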

rlc

That's nearly right -- if in fact no solution can be built from 4 and 2, then the inner `while` loop will get rid of the 2 and start searching to the right of it. – j_random_hacker Mar 31 '13 at 14:06
@j_random_hacker the point is that if a solution can be built with 4 and 2, it should still try without 2, to find combinations with 4 without 2. – rlc Mar 31 '13 at 16:55
That's correct, but you're saying something slightly different in your answer. "so as long as '2' is in your set before '8', you'll never get a subset with '4' and '8'" is not true -- you can get a subset with 4 and 8 if there is no way to make a solution with 4 and 2. – j_random_hacker Mar 31 '13 at 18:39
@j_random_hacker I was referring to his example sequence 4 3 2 6 8 ... In that sequence, a combination starting with 4 and 2 is possible, so 2 never gets dropped. You're right that I should have clarified that though.. – rlc Apr 03 '13 at 04:07

j_random_hacker · Answer 2 · 2013-03-31T14:44:40.210

It may be that a stack-free solution is possible, but the usual (and generally easiest!) way to implement backtracking algorithms is through recursion, e.g.:

int i = 0, n;    // i needs to be visible to show()
int s[100];

// Considering only the subset of prob[] values whose indexes are >= start,
// print all subsets that sum to total.
void new_subsets(int start, int total) {
    if (total == 0) show();    // total == 0 means we already have a solution

    // Look for the next number that could fit
    while (start < n && prob[start] > total) {
        ++start;
    }

    if (start < n) {
        // We found a number, prob[start], that can be added without overflow.
        // Try including it by solving the subproblem that results.
        s[i++] = start;
        new_subsets(start + 1, total - prob[start]);
        i--;

        // Now try excluding it by solving the subproblem that results.
        new_subsets(start + 1, total);
    }
}

You would then call this from main() with new_subsets(0, d);. Recursion can be tricky to understand at first, but it's important to get your head around it -- try easier problems (e.g. generating Fibonacci numbers recursively) if the above doesn't make any sense.
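For experimenting without the globals and `show()`, here is a self-contained sketch of the same recursion that collects the chosen indexes instead of printing them. The names `collect_subsets` and `all_subsets` are made up for illustration, and it assumes all values are non-negative, since the "skip numbers that are too large" step relies on that.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: same include/exclude recursion as above, but the
// current selection lives in 'chosen' and every solution is appended to 'out'.
// Assumes non-negative values in prob.
void collect_subsets(const std::vector<int>& prob, std::size_t start, int total,
                     std::vector<std::size_t>& chosen,
                     std::vector<std::vector<std::size_t>>& out) {
    if (total == 0) out.push_back(chosen);   // current selection hits the target

    // Skip values too large to fit into what remains of the target.
    while (start < prob.size() && prob[start] > total) ++start;
    if (start == prob.size()) return;

    chosen.push_back(start);                                  // include prob[start]
    collect_subsets(prob, start + 1, total - prob[start], chosen, out);
    chosen.pop_back();                                        // undo the choice
    collect_subsets(prob, start + 1, total, chosen, out);     // exclude prob[start]
}

// Hypothetical wrapper: returns every index list whose values sum to target.
std::vector<std::vector<std::size_t>> all_subsets(const std::vector<int>& prob, int target) {
    std::vector<std::size_t> chosen;
    std::vector<std::vector<std::size_t>> out;
    collect_subsets(prob, 0, target, chosen, out);
    return out;
}
```

With the question's input, `all_subsets({4, 3, 2, 6, 8, 12, 21}, 12)` yields the index lists {0, 2, 3}, {0, 4} and {5}, i.e. 4+2+6, 4+8 and 12.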

Working instead with the solution you have given, one problem I can see is that as soon as you find a solution, you wipe it out and start looking for a new solution from the number to the right of the first number that was included in this solution (top = -1; i = s[top+1]; implies i = s[0], and there is a subsequent i++;). This will miss solutions that begin with the same first number. You should just do if (sum == d) { show(); } instead, to make sure you get them all.

I initially found your inner while loop pretty confusing, but I think it's actually doing the right thing: once i hits the end of the array, it will delete the last number added to the partial solution, and if this number was the last number in the array, it will loop again to delete the second-to-last number from the partial solution. It can never loop more than twice because numbers included in a partial solution are all at distinct positions.
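Putting those two observations together, a sketch of the question's loop with the reset removed might look like the following. It is a rewrite under assumptions, not the asker's exact code: `subsets_iterative` is a made-up name, a `std::vector` replaces `s[]` and `top`, results are collected rather than printed, and values are assumed non-negative. One adjustment goes beyond the suggestion above: the `sum == d` check sits inside the branch that adds an element, because otherwise the same subset would be reported again each time a later element is skipped.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical rewrite of the question's loop; assumes non-negative values.
std::vector<std::vector<std::size_t>> subsets_iterative(const std::vector<int>& prob, int d) {
    std::vector<std::vector<std::size_t>> found;
    std::vector<std::size_t> s;          // stack of chosen indexes (replaces s[] and top)
    const std::size_t n = prob.size();
    std::size_t i = 0;
    int sum = 0;
    while (i < n) {
        if (sum + prob[i] <= d) {
            s.push_back(i);
            sum += prob[i];
            // Record a hit right after adding an element; do NOT reset s or sum.
            if (sum == d) found.push_back(s);
        }
        ++i;
        // Backtrack: at the end of the array, drop the last choice and
        // resume the scan just after its position.
        while (i == n && !s.empty()) {
            sum -= prob[s.back()];
            i = s.back() + 1;
            s.pop_back();
        }
    }
    return found;
}
```

On the question's input this returns the index lists {0, 2, 3}, {0, 4} and {5}, so the missing 4, 8 solution now appears.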

score 1 · Answer 3 · answered Mar 31 '13 at 15:01

I haven't analysed the algorithm in detail, but what struck me is that your algorithm doesn't account for the possibility that, after having one solution that starts with number X, there could be multiple solutions starting with that number.

A first improvement would be to avoid resetting your stack s and the running sum after you printed the solution.
