
I was looking at the balanced partitioning problem here and here (problem 7).

The problem asks to partition a given array of numbers into 2 subsets (S1 and S2) such that the absolute difference between their sums, |sum(S1) - sum(S2)|, is minimal. One thing I don't understand is why nobody suggests a greedy approach:

def balanced_partition(lst):
    idx = 0
    S1 = 0
    S2 = 0
    result_partition=[None]*len(lst)
    while idx < len(lst):
        new_S1 = S1 + lst[idx]
        new_S2 = S2 + lst[idx]
        if abs(new_S1 - S2) < abs(new_S2 - S1):
            result_partition[idx] = 1
            S1 = new_S1
        else:
            result_partition[idx] = 2
            S2 = new_S2
        idx += 1
    print("final sums s1 = {S1} and s2 = {S2} ".format(S1=S1, S2=S2))
    return result_partition

What is wrong with my approach? It seems to pass all the test cases I can come up with.

kmad1729
  • "It seems to pass most of the test cases". So it fails some test cases? Doesn't that answer your question? – Paul Hankin Mar 07 '17 at 03:54
  • By "most of the test cases" I meant that I couldn't find any arguments against the greedy approach and couldn't come up with or find any negative test cases myself. Edited my question. – kmad1729 Mar 07 '17 at 15:13
  • How did you search for negative test cases? Nearly every sorted list is a counterexample to your approach being optimal (eg: [1, 2, 3]). – Paul Hankin Mar 07 '17 at 15:34
  • A simple [google search](https://www.google.com/search?q=balanced+partition+greedy+approach) wasn't helpful, and I didn't really think of the sorted case. – kmad1729 Mar 07 '17 at 15:43
  • I don't think you can have tried much -- most unsorted lists are also counterexamples. Even if you only consider lists of length 3, [random.randrange(1000) for _ in xrange(3)] is a counterexample about 1/3 of the time. – Paul Hankin Mar 07 '17 at 15:50

2 Answers


The simple counterexample is [1,1,1,1,1,1,6]. The greedy approach will spread the ones between the two sets, while the optimal solution is [1,1,1,1,1,1],[6].
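To check this counterexample by hand, here is a minimal sketch that compares the question's greedy rule with an exhaustive search. The helper names `greedy_difference` and `best_difference` are made up for this illustration, and the exhaustive search is only practical for very short lists.

```python
from itertools import product


def greedy_difference(lst):
    """Difference left by the question's greedy rule (tracks the sums only)."""
    s1 = s2 = 0
    for x in lst:
        # Same comparison as in the question; ties go to S2.
        if abs((s1 + x) - s2) < abs((s2 + x) - s1):
            s1 += x
        else:
            s2 += x
    return abs(s1 - s2)


def best_difference(lst):
    """Try all 2**n assignments and return the smallest difference."""
    total = sum(lst)
    best = None
    for mask in product((0, 1), repeat=len(lst)):
        s1 = sum(x for x, bit in zip(lst, mask) if bit)
        diff = abs(total - 2 * s1)  # |s1 - s2| with s2 = total - s1
        if best is None or diff < best:
            best = diff
    return best


example = [1, 1, 1, 1, 1, 1, 6]
print(greedy_difference(example), best_difference(example))  # prints: 6 0
```

The greedy rule ends with sums 3 and 9, while putting the six ones on one side and the 6 on the other gives a difference of 0.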

Rafał Dowgird

There is nothing wrong with your implementation, but the greedy approach is only a heuristic. If you consider all subsets in this particular problem, you may find a better answer than the greedy output. Even the wiki page that you shared has some examples.

You probably already know the difference between those two approaches. A greedy algorithm often gives you a reasonable result, sometimes equal to the best one, but it carries no guarantee: you have to consider all options to be sure. The dynamic programming approach effectively checks all possible subsets. Because it saves the results of previously computed sub-problems, it is faster than brute force.
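As a rough illustration of the dynamic programming idea, here is a sketch based on the standard subset-sum table. It assumes the input contains only non-negative integers, and the function name `min_partition_difference` is made up for this example. It returns only the minimal difference, not the partition itself.

```python
def min_partition_difference(nums):
    """Smallest |sum(S1) - sum(S2)| over all ways to split nums in two.

    Assumption of this sketch: nums holds non-negative integers.
    """
    total = sum(nums)
    half = total // 2
    # reachable[s] is True when some subset of the items seen so far sums to s.
    reachable = [False] * (half + 1)
    reachable[0] = True
    for x in nums:
        # Iterate downwards so that each item is used at most once.
        for s in range(half, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True
    # The best S1 is the largest reachable sum that does not exceed total / 2.
    best = max(s for s in range(half + 1) if reachable[s])
    return total - 2 * best


print(min_partition_difference([1, 1, 1, 1, 1, 1, 6]))  # prints: 0
print(min_partition_difference([1, 2, 7]))              # prints: 4
```

The table has about sum(nums) / 2 entries and is updated once per item, so the running time grows with len(nums) * sum(nums) rather than with 2**len(nums).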

The question is when to use a greedy or a dynamic programming approach. I have done some competitive programming, and when I see a DP problem (partitioning, subset sum, knapsack and so on), I sometimes come up with a greedy solution immediately, because most of the time it is the more obvious one. People use greedy approaches all the time in daily life. Before implementing, I test my algorithm with examples, and if I convince myself that it is the right approach, I implement it. It is kind of intuitive.

If you find a test case that should have a better answer, it most probably means you have to find a DP solution. If you get WA (wrong answer) from the judge system, it means you haven't found good test cases. That's okay: you don't have to find that exact test case, because it won't help you find a better solution.
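If you do want to hunt for such a test case on small inputs, one option is to compare the greedy result against brute force on random lists. The sketch below is only an illustration: the function names, the default parameters and the fixed seed are all assumptions of this example, and the brute force is usable only for short lists.

```python
import random
from itertools import combinations


def greedy_difference(lst):
    """Difference left by the question's greedy rule (tracks the sums only)."""
    s1 = s2 = 0
    for x in lst:
        if abs((s1 + x) - s2) < abs((s2 + x) - s1):
            s1 += x
        else:
            s2 += x
    return abs(s1 - s2)


def brute_force_difference(lst):
    """Smallest difference over every possible choice of S1."""
    total = sum(lst)
    return min(
        abs(total - 2 * sum(subset))
        for size in range(len(lst) + 1)
        for subset in combinations(lst, size)
    )


def find_counterexample(trials=1000, length=3, max_value=1000, seed=0):
    """Return a random list on which greedy is not optimal, or None."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    for _ in range(trials):
        lst = [rng.randrange(max_value) for _ in range(length)]
        if greedy_difference(lst) != brute_force_difference(lst):
            return lst
    return None


print(find_counterexample())
```

Any list it returns is one where the brute-force difference is strictly smaller than the greedy one.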