4

Given a set S of n positive integers, we want to know whether we can assign a sign (+ or -) to each of the numbers in S such that the signed sum is 0.

How can one efficiently solve this problem? Based on similar problems, I'd imagine some kind of dynamic programming is in order. Is there any literature on this specific problem? I am having trouble finding it.

I guess this is similar to the subset sum problem. However, now we have to use the entire set, and for each integer s_i we must include either -s_i or +s_i, but not both.

Anatolii
Simon H

2 Answers

4

The solution to this problem involves the subset sum problem.

If some subset of the numbers sums to exactly half of the total sum of the array, we can make all of those numbers negative and leave the rest positive. Both groups then sum to half of the total, so the signed sum is 0.
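The step from "signs" to "subset sum" can be written out explicitly. The symbols T (the total of all numbers) and N (the subset that receives a minus sign) are introduced here only for illustration:

```latex
\sum_{s \in S \setminus N} s \;-\; \sum_{s \in N} s
  \;=\; \Bigl(T - \sum_{s \in N} s\Bigr) - \sum_{s \in N} s
  \;=\; T - 2 \sum_{s \in N} s,
\qquad T = \sum_{s \in S} s .
```

The signed sum is therefore 0 exactly when the numbers in N sum to T / 2, which in particular requires T to be even.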

Here is the code in C++:

#include<stdio.h>

int arr[] = {1, 2, 2, 3, 4};
int n = 5; // size of arr
int sum = 0;

// dp array needs to be at least [n + 1][total sum + 1] big;
// the fixed size below handles n <= 29 and a total sum <= 99
bool dp[30][100];
inline void subset_sum(){
    for (int i = 0; i <= sum; i++)
        dp[0][i] = false;

    for (int i = 0; i <= n; i++)
        dp[i][0] = true;

    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= sum; j++) {
            dp[i][j] = dp[i - 1][j];
            if (arr[i - 1] <= j)
                dp[i][j] |= dp[i - 1][j - arr[i - 1]];
        }
    }
}
int main(){
    for (int i = 0; i < n; i++)
        sum += arr[i];

    // run subset sum dp using a bottom-up approach
    // True = sum is possible, False = not possible
    subset_sum();

    int max_half = 0; // stays 0 if no non-empty subset sums to at most sum / 2
    for (int i = sum / 2; i>=1; i--){
        if (dp[n][i]){ // it is possible to sum to i using values in arr
            max_half = i;
            break;
        }
    }

    // output will be the closest sum of positives
    // and negatives to 0
    printf("%d\n", 2 * max_half - sum);

    return 0;
}

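The output of this code is the signed sum closest to 0 that can be reached by assigning + or - to every number in the set. It is printed as a value at or below 0, so an output of 0 means a valid assignment exists.

The expression 2 * max_half - sum comes from max_half - (sum - max_half): the largest reachable subset sum that does not exceed half of the total, minus the sum of the remaining numbers.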

Here are some examples of different sets of numbers and their respective outputs:

Set: {1, 2, 2, 3, 4}, output: 0.

Set: {1, 1, 1, 1, 1}, output: -1.

Set: {5, 2, 6, 8, 9, 2}, output: 0.

Set: {1, 50}, output: -49.
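The program above only reports how close to 0 the signed sum can get. If the actual signs are needed, one possible approach is to walk backwards through the same kind of dp table. The following is a minimal sketch under that assumption; the function name zero_sum_signs is hypothetical and not part of the original code, and it assumes a non-empty list of positive integers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of the original program): returns one sign
// (+1 or -1) per input value so that the signed sum is 0, or an empty
// vector if no such assignment exists. Assumes all values are positive.
std::vector<int> zero_sum_signs(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        assert(v > 0);
        total += v;
    }
    if (total % 2 != 0) return {};  // an odd total can never be split evenly
    const int half = total / 2;
    const int n = static_cast<int>(values.size());

    // dp[i][j] == true: some subset of the first i values sums to j
    std::vector<std::vector<bool>> dp(n + 1, std::vector<bool>(half + 1, false));
    dp[0][0] = true;
    for (int i = 1; i <= n; i++) {
        for (int j = 0; j <= half; j++) {
            dp[i][j] = dp[i - 1][j];
            if (values[i - 1] <= j && dp[i - 1][j - values[i - 1]])
                dp[i][j] = true;
        }
    }
    if (!dp[n][half]) return {};

    // Walk the table backwards: value i - 1 goes into the negative subset
    // whenever the current target j is not reachable without it.
    std::vector<int> signs(n, +1);
    int j = half;
    for (int i = n; i >= 1; i--) {
        if (!dp[i - 1][j]) {
            signs[i - 1] = -1;
            j -= values[i - 1];
        }
    }
    return signs;
}
```

For a set such as {1, 2, 2, 3, 4} this returns five signs whose signed sum is 0; for {1, 1, 1, 1, 1} or {1, 50} it returns an empty vector because the total is odd.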


There are many explanations for the subset sum problem on the internet, so I will not explain it here.

The time complexity of this code is O(n * sum), and the space complexity is O(n * sum).

It is also possible to reduce the space complexity to O(sum) by using a one-dimensional dp array; the time complexity stays O(n * sum).
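A one-dimensional variant could look roughly like the sketch below. The function name closest_to_zero is hypothetical, and the table is only tracked up to sum / 2 because larger subset sums are never inspected. Iterating j downwards ensures each number is used at most once.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the signed sum closest to 0 (reported as a
// value <= 0, like the program above) over all sign assignments.
// Assumes all values are positive.
int closest_to_zero(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        assert(v > 0);
        total += v;
    }
    const int half = total / 2;

    // reachable[j] == true: some subset of the values seen so far sums to j
    std::vector<bool> reachable(half + 1, false);
    reachable[0] = true;
    for (int v : values) {
        // go from high to low so that v is not counted twice in one pass
        for (int j = half; j >= v; j--) {
            if (reachable[j - v]) reachable[j] = true;
        }
    }

    // largest reachable subset sum that does not exceed half of the total
    int best = 0;
    for (int j = half; j >= 0; j--) {
        if (reachable[j]) {
            best = j;
            break;
        }
    }
    return 2 * best - total;
}
```

With the example sets listed earlier, this sketch yields the same values as the two-dimensional version: 0, -1, 0 and -49.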

Blackgaurd
  • The idea makes sense, but after trying it out, something seems to be wrong. Apart from dp[6] being out of range (should be dp[5] I guess), the resulting dp array is [0, 4, 1, 3, 1, 2], which means no combination exists. However, we can clearly form the combination 1 + 2 - 2 + 3 - 4. Any idea what is going on? – Simon H Mar 24 '21 at 13:05
  • Taking a look into it, the dp condition I chose to use may not be the best choice for this situation. It's not solely defined by how close the value is to 0. In the meantime, I'm still trying to think of a better condition to use, or maybe an alternate solution. – Blackgaurd Mar 24 '21 at 15:43
  • I have finally figured out a solution to this problem, and have edited my answer. Please take a look. – Blackgaurd Mar 29 '21 at 15:28
3

Given that the problem is NP-complete (it is equivalent to the partition problem), a SAT, MILP, CP or ASP solver is a good choice, as these are tailored to this kind of problem.

Solution

Here is a solution using ASP (Answer Set Programming).

Given a file instance.lp:

value(12).
value(12).
value(1).
value(2).
value(3).
value(5).
value(6).
value(7).

and the file encoding.lp:

% every value can be positive (or not)
{pos(X)} :- value(X).

% fail if the sum is not 0
:- not 0 = #sum {V : pos(V); -V : not pos(V), value(V)}.

% format output
#show pos/1.
#show neg(V) : not pos(V), value(V).

the problem can be solved using clingo, an ASP solver from the Potassco tool collection (easily installable via conda, pip, the Ubuntu package manager, etc.).

Calling:

clingo instance.lp encoding.lp

gives you the result:

Answer: 1
pos(1) pos(2) pos(3) pos(5) pos(7) neg(6) neg(12)

You can enumerate all possible solutions with:

clingo instance.lp encoding.lp 0

giving you

Answer: 1
pos(1) pos(2) pos(3) pos(5) pos(7) neg(6) neg(12)
Answer: 2
pos(2) pos(3) pos(6) pos(7) neg(5) neg(1) neg(12)
Answer: 3
pos(5) pos(6) pos(7) neg(3) neg(2) neg(1) neg(12)
Answer: 4
pos(12) pos(1) pos(2) pos(3) neg(7) neg(6) neg(5)
Answer: 5
pos(12) pos(6) neg(7) neg(5) neg(3) neg(2) neg(1)
Answer: 6
pos(12) pos(1) pos(5) neg(7) neg(6) neg(3) neg(2)

ASP

Using ASP to solve the problem has the following advantages. It is:

  • easily maintainable (very short description of the problem, easy to read)
  • very fast (based on SAT and CDNL)
  • declarative (you only describe the problem, not how to solve it)
  • easily extensible with other constraints
  • also able to do all kinds of optimization (like optimizing for the biggest subset to form the sum)

Edit: You can also copy and paste the content of both files to check the results yourself online, using a JavaScript compilation of clingo.

Max Ostrowski