16

I saw this question in a programming interview blog.

Given the pairwise sums of n numbers in non-decreasing order, identify the individual numbers. If the list of sums is corrupted (no such numbers exist), print -1.

Example:

i/p: 4 5 7 10 12 13 

o/p: 1 3 4 9

A hint would suffice.

ash

6 Answers

11

Let B be the list of pairwise sums, with B[0] <= B[1] <= ... <= B[m-1] and let A be the original list of numbers that we're trying to find, with A[0] < A[1] < ... < A[n-1], where m = n(n-1)/2.

Given A[0], compute A in polynomial time

Build A up from smallest element to largest. Suppose that we already know A[0]. Then, since B[0] is the smallest element in B, it can only arise as A[0] + A[1]. Similarly, B[1] must equal A[0] + A[2]. Therefore, if we know A[0], we can compute A[1] and A[2].

After that, however, this pattern breaks down. B[2] could either be A[0] + A[3] or A[1] + A[2] and without prior knowledge, we cannot know which one it is. However, if we know A[0], we can compute A[1] and A[2] as described above, and then remove A[1] + A[2] from B. The next smallest element is then guaranteed to be A[0] + A[3], which allows us to find A[3]. Continuing like this, we can find all of A without ever backtracking. The algorithm looks something like this:

for i from 1 to n-1 {
    // REMOVE SEEN SUMS FROM B
    for j from 0 to i-2 {
        remove A[j]+A[i-1] from B
    }
    // SOLVE FOR NEXT TERM
    A[i] = B[0] - A[0]
}
return A
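A minimal Python sketch of the loop above, assuming B is a sorted list of integers and that a guess for A[0] is supplied by the caller. The name `reconstruct` and the use of `collections.Counter` as a multiset are illustrative choices, not part of the pseudocode. The sums involving each new element are removed as soon as that element is found, which is the same bookkeeping shifted by one step, and the function returns None when a sum that should be removed is not in B.

```python
from collections import Counter


def reconstruct(b, a0):
    """Rebuild A from the sorted pairwise sums b, given a guess a0 for A[0].

    Returns the list A, or None if the guess is inconsistent with b.
    """
    # Find n with n*(n-1)/2 == len(b); reject lengths that are impossible.
    n = 2
    while n * (n - 1) // 2 < len(b):
        n += 1
    if n * (n - 1) // 2 != len(b):
        return None

    remaining = Counter(b)  # multiset of sums not yet accounted for
    a = [a0]
    pos = 0  # index into b of the smallest sum that may still remain
    for _ in range(n - 1):
        # Skip sums already removed. Fewer than len(b) sums have been
        # removed at this point, so pos stays in range.
        while remaining[b[pos]] == 0:
            pos += 1
        nxt = b[pos] - a0  # smallest remaining sum is A[0] + A[i]
        # Remove the sum of the new element with every known element.
        for x in a:
            s = x + nxt
            if remaining[s] == 0:
                return None  # a required sum is missing: bad guess
            remaining[s] -= 1
        a.append(nxt)
    return a
```

With the example, `reconstruct([4, 5, 7, 10, 12, 13], 1)` gives `[1, 3, 4, 9]`, while a guess of 0 returns None because 4+5 = 9 is not in B.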

Here's how this works from your example where B = [4,5,7,10,12,13] if we know A[0]=1:

start
    B = [4,5,7,10,12,13]
    A[0] = 1

i=1: 
    B = [4,5,7,10,12,13]
    A[1] = 4-1 = 3

i=2:
    Remove 1+3 from B
    B = [5,7,10,12,13]
    A[2] = 5-1 = 4

i=3:
    Remove 1+4 and 3+4 from B
    B = [10,12,13]
    A[3] = 10-1 = 9

end
    Remove 1+9 and 3+9 and 4+9 from B
    B = []
    A = [1,3,4,9]

So it all comes down to knowing A[0], from which we can compute the rest of A.

Compute A[0] in polynomial time

We can now simply try every possibility for A[0]. Since B[0] = A[0] + A[1] and A[0] < A[1], we know A[0] < B[0]/2. Assuming the numbers are non-negative integers, A[0] must therefore be an integer with 0 <= A[0] < B[0]/2. We also know that

B[0] = A[0] + A[1]
B[1] = A[0] + A[2]

Moreover, there is some index i with 2 <= i <= n-1 such that

B[i] = A[1] + A[2]

Why? Because the only entries potentially smaller than A[1]+A[2] are of the form A[0] + A[j], and there are at most n-1 such expressions. Therefore we also know that

A[0] = (B[0]+B[1] - B[i])/2

for some 2 <= i <= n-1. This, together with the fact that 0 <= A[0] < B[0]/2, leaves only a few possibilities for A[0] to test.
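As a rough illustration, the candidate list can be generated like this in Python. The function name `candidates_for_a0` is hypothetical, and the sketch assumes integer inputs, non-negative originals, and n >= 3.

```python
def candidates_for_a0(b, n):
    """Candidate values for A[0] from A[0] = (B[0] + B[1] - B[i]) / 2.

    b is the sorted list of pairwise sums and n the number of originals.
    Only indices 2 <= i <= n-1 are tried, and a candidate is kept only if
    it is an integer with 0 <= A[0] < B[0]/2.
    """
    found = []
    for i in range(2, min(n, len(b))):
        twice = b[0] + b[1] - b[i]
        if twice % 2 != 0:
            continue  # A[0] would not be an integer
        a0 = twice // 2
        if 0 <= 2 * a0 < b[0] and a0 not in found:
            found.append(a0)
    return found
```

On the example B this returns `[1]`: the formula is stricter than the range test alone, because it also requires B[0] + B[1] - B[i] to be even.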

For the example, there are two possibilities for A[0]: 0 or 1. If we try the algorithm with A[0]=0, here's what happens:

start
    B = [4,5,7,10,12,13]
    A[0] = 0

i=1: 
    B = [4,5,7,10,12,13]
    A[1] = 4-0 = 4

i=2:
    Remove 0+4 from B
    B = [5,7,10,12,13]
    A[2] = 5-0 = 5

i=3:
    Remove 0+5 and 4+5 from B
    B = !!! PROBLEM, THERE IS NO 9 IN B!

end
PengOne
  • Thank you PengOne. I am trying to understand your solution. So to find A[0] we have to try every value between 0 and B[0]/2, and we will know it's not right when we proceed with the algorithm and find we don't have a solution, right? Or am I missing something? – ash Dec 19 '11 at 22:04
  • @ash Yes. I've just added in examples that I hope will help. Please let me know if it's still unclear. – PengOne Dec 19 '11 at 22:12
  • Thanks again PengOne. I got it and it is a reasonable approach. But I am still wondering: if the first sum is large, we would have to run the algorithm many times to find A[0]. I am thinking we could use some sort of binary search, so that if a subtraction gives a negative value then A[0] should be less than that, and so on. Does that make sense? – ash Dec 19 '11 at 22:25
  • 2
    If we know A[0], then we can subtract it from every value of A and also subtract *twice* it from B. This is fully equivalent. Then, because A[0] is 0, we know that B includes every element of A (except for that first 0 of course). Does this help? – Aaron McDaid Dec 19 '11 at 22:36
  • @AaronMcDaid - That's a very nice insight! – Ted Hopp Dec 19 '11 at 23:16
  • @Aaron just had a chance to think about your observation. That should do the trick! Thanks – ash Dec 19 '11 at 23:33
  • @ash You can find `A[0]` in `O(n)` tries. Please see my revisions above. – PengOne Dec 19 '11 at 23:38
  • @AaronMcDaid Good idea. See my revisions above. – PengOne Dec 19 '11 at 23:38
  • @PengOne Looks great now! Thank you. Let me think about it more. – ash Dec 19 '11 at 23:48
  • Given an estimate of `A[0]` which fails, is it possible to tell why it failed? In particular, whether it was too high or too low? If this is possible then we could do some sort of binary search. For example, if `A[0]` is too low, then our estimates of `A[1]` and `A[2]` will be too high. – Aaron McDaid Dec 20 '11 at 00:13
  • 1
    @AaronMcDaid This was ash's idea. I'm not certain if that will work. Also, as Ted points out, `B[0]/2-1` could be far greater than `n`, so even a binary search there might be worse than `O(n)`. – PengOne Dec 20 '11 at 00:20
  • Doh! I see @ash's comment now about the binary search. Thanks. – Aaron McDaid Dec 20 '11 at 00:25
  • @Aaron I have to agree with PengOne and Ted on this. Though I don't see an intuitive way to use estimates of A[0], A[1] and A[2], we could use an overestimate to find the approximate position of A[1]+A[2] in the B array, and then maybe use that to see whether future estimates of A[1] and A[2] are indeed overestimates, or maybe even use that to solve for A[0], A[1] and A[2]. I don't know how it will turn out though. – ash Dec 20 '11 at 01:05
1

Some hints:

  • The size of the input is N*(N-1)/2, so you can deduce the size of the output (i.e. 6 elements in the input correspond to 4 elements in the output)

  • The sum of the output is the sum of the input divided by N - 1, because each output value appears in N - 1 of the pairwise sums (i.e. 1+3+4+9 = (4+5+7+10+12+13) / (4-1))

  • The lowest input and highest inputs are the sum of the two lowest and two highest outputs respectively (i.e. 4 = 1 + 3 and 13 = 4 + 9)

  • The next lowest input (5 = 1 + 4) differs from the lowest (4 = 1 + 3) in only one addend, so taking the difference (5 - 4 = 1) gives the difference between those two addends (4 - 3).
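The first two hints can be checked mechanically. Here is a small Python sketch, assuming integer inputs; the function names `output_size` and `output_total` are made up for illustration.

```python
import math


def output_size(m):
    """Return n such that n*(n-1)/2 == m, or None if no such n exists."""
    n = (1 + math.isqrt(1 + 8 * m)) // 2
    return n if n * (n - 1) // 2 == m else None


def output_total(sums):
    """Return the sum of the original values, or None if it cannot exist.

    Each original value appears in exactly n-1 pairwise sums, so the
    total of the sums must be divisible by n-1.
    """
    n = output_size(len(sums))
    if n is None or n < 2:
        return None
    total, remainder = divmod(sum(sums), n - 1)
    return total if remainder == 0 else None
```

For the example, `output_size(6)` is 4 and `output_total([4, 5, 7, 10, 12, 13])` is 17, which matches 1+3+4+9. A None from either function already means the input is corrupted.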

Raymond Hettinger
  • 1
    I am unable to figure out how one could use a+b+c+d=something, a+b=something and c+d=something to solve for a... or am I missing something in your fourth hint? – ash Dec 19 '11 at 20:27
1

Ferdinand Beyer was on the right track, I think, before he deleted his answer. To repeat part of his approach: you have four unknowns, a, b, c, and d with a ≤ b ≤ c ≤ d. From this, one can form a partial ordering of all the sums:

a + b ≤ a + c
a + b ≤ a + d
a + c ≤ b + c
a + d ≤ b + d
a + d ≤ c + d
b + c ≤ b + d
b + d ≤ c + d

If this were a total order, then one would know each of the six values a + b, a + c, a + d, b + c, b + d, and c + d. One could then follow Ferdinand's original plan and easily solve the simultaneous equations.

Unfortunately, there is the pair (a + d, b + c), which can be ordered either way. But this is easy enough to handle: assume that a + d < b + c (the input values are all distinct, so one need not worry about using ≤) and try to solve the simultaneous equations. Then assume b + c < a + d and repeat. If both sets of equations have a solution, then the original problem has two answers. If neither set has a solution, then the output should be -1. Otherwise, you have your (unique) solution.
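For the four-number case, the two orderings can be tried directly. The following Python sketch assumes integer inputs and a sorted list of exactly six sums; `solve_four` is a hypothetical name. In either ordering s[0] = a+b, s[1] = a+c and s[5] = c+d, and only the position of b+c (s[2] or s[3]) changes.

```python
from itertools import combinations


def solve_four(s):
    """All solutions (a, b, c, d) whose sorted pairwise sums equal s."""
    solutions = []
    # First assume a+d <= b+c (so b+c is s[3]), then b+c <= a+d (s[2]).
    for bc in (s[3], s[2]):
        twice_a = s[0] + s[1] - bc  # (a+b) + (a+c) - (b+c) = 2a
        if twice_a % 2 != 0:
            continue  # no integer solution under this ordering
        a = twice_a // 2
        b = s[0] - a
        c = s[1] - a
        d = s[5] - c
        candidate = tuple(sorted((a, b, c, d)))
        # Verify against all six sums, since only four were used above.
        sums = sorted(x + y for x, y in combinations(candidate, 2))
        if sums == list(s) and candidate not in solutions:
            solutions.append(candidate)
    return solutions
```

`solve_four([4, 5, 7, 10, 12, 13])` returns `[(1, 3, 4, 9)]`. The input `[3, 5, 6, 8, 9, 11]` is a two-answer case and returns `[(0, 3, 5, 6), (1, 2, 4, 7)]`. An empty list corresponds to printing -1.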

Ted Hopp
  • I am sorry, but the input order is non-decreasing, so the sums might not be distinct, right? And if the input is bigger, meaning the original number set is larger, then the number of partially ordered pairs grows too, right? I am still thinking, but your solution is obvious to me. – ash Dec 19 '11 at 20:24
  • 2
    @ash - My mistake; I was working off the specific set of numbers you gave. But nothing is lost. Just assume `a + d ≤ b + c` and vice versa. The only complication is a trivial one: the two-solution case may collapse to a single solution. – Ted Hopp Dec 19 '11 at 20:34
  • @ash - If the input includes repeated numbers when two pairwise sums are equal, then you always know how many numbers need to be in the output. dealing with longer input becomes more complex but can be handled, I think, in a similar way: determine the partial ordering, then investigate each compatible total ordering for a solution. The number of computations grows, but the recipe is still simple. – Ted Hopp Dec 19 '11 at 20:40
  • I am trying to work out your solution for six numbers: a+b, a+c, a+d, a+e, a+f, b+c, b+d, b+e, b+f, c+d, c+e, c+f, d+e, d+f, e+f. As you said, the first element is always a+b and the second is always a+c. The third element is either b+c or a+d. So far so good. After this we have another unordered pair: b+c and a+e. That means it could be a+d b+c a+e, or a+d a+e b+c, or b+c a+d a+e. Next we have b+d, which again has no ordering with a+e, so the following possible sequences result: a+d b+c b+d a+e, or a+d b+c a+e b+d, or b+c a+d a+e b+d, or b+c a+d b+d a+e, and so on. Then c+d, which again has no ordering with a+d. – ash Dec 19 '11 at 22:16
  • Am I thinking right? If this is the case, we will need to resolve the entire set of sums to come up with all possible orderings, and then use your technique to verify each potential path, right? Isn't it computationally infeasible? Please tell me where I am missing the trick. – ash Dec 19 '11 at 22:16
  • @ash - There is indeed a combinatorial explosion. The general problem is, I think, NP-hard. But I think that this approach, although exponential, produces correct results. – Ted Hopp Dec 19 '11 at 22:24
  • I am pretty sure your approach is indeed right. I will get back if I could find some exceptions. Thank you! – ash Dec 19 '11 at 22:28
  • 1
    @TedHopp I gave a polynomial time solution, so this is NP-hard if and only if P=NP. – PengOne Dec 19 '11 at 22:35
  • @PengOne - Good point. I'm not 100% sure that your approach is polynomial time. The work to identify A[0] seems proportional to the magnitude of A[0], which would make it exponential (since it only takes log(A[0]) bits to represent A[0]). However, if the binary search trick works, that would take care of this objection. In view of this and also Aaron's comment to your solution, I'm now thinking that the problem is probably in P. – Ted Hopp Dec 19 '11 at 23:11
  • @TedHopp You're quite correct. I just updated my solution to show there are at most `max(B[0]/2-1, n-2)` possible values for `A[0]`, so it is indeed polynomial time. – PengOne Dec 19 '11 at 23:39
  • @PengOne shouldn't it be min(B[0]/2-1,n-2)values ? – ash Dec 20 '11 at 00:13
1

PengOne's approach to recovering A given A[0] and B is good, but there is a better way to compute A[0]. Note that the two smallest elements of B are:

B[0] = A[0] + A[1]
B[1] = A[0] + A[2]

and

B[i] = A[1] + A[2]

for some i.

Therefore,

A[0] = (B[0] + B[1] - B[i]) / 2

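for some i. Here let n be the number of pairwise sums in the input (the length of B), not the number of original values, so A has O(n^{1/2}) elements. Since i is bounded by the length of A, we simply need to try O(n^{1/2}) possibilities and see if one leads to a valid setting of the remaining elements of A per PengOne's solution. Each trial takes O(n) time if removals from B are constant-time, so the total running time is O(n^{3/2}).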
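Putting the two pieces together, a complete solver might look like the Python sketch below. It assumes integer inputs and at least three original values, uses a `collections.Counter` as the multiset for B, and the names `recover` and `_try_guess` are illustrative. It is one possible reading of the approach, not a reference implementation.

```python
import math
from collections import Counter


def _try_guess(b, n, a0):
    """Rebuild A from sorted sums b for a guessed A[0], or return None."""
    remaining = Counter(b)  # multiset of sums not yet explained
    a = [a0]
    pos = 0
    for _ in range(n - 1):
        # Fewer than len(b) sums are removed so far, so pos stays valid.
        while remaining[b[pos]] == 0:
            pos += 1
        nxt = b[pos] - a0  # smallest remaining sum is A[0] + next element
        for x in a:
            if remaining[x + nxt] == 0:
                return None  # expected sum missing: wrong guess
            remaining[x + nxt] -= 1
        a.append(nxt)
    return a


def recover(b):
    """Return one list A whose sorted pairwise sums equal b, or -1."""
    m = len(b)
    n = (1 + math.isqrt(1 + 8 * m)) // 2  # number of original values
    if n < 3 or n * (n - 1) // 2 != m:
        return -1  # this sketch only handles n >= 3
    for i in range(2, n):  # B[i] = A[1] + A[2] for some 2 <= i <= n-1
        twice = b[0] + b[1] - b[i]
        if twice % 2 != 0:
            continue
        a = _try_guess(b, n, twice // 2)
        if a is not None:
            return a
    return -1
```

`recover([4, 5, 7, 10, 12, 13])` returns `[1, 3, 4, 9]`, and changing the last sum to 14 makes it return -1. When several answers exist, only the first one found is returned.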

jonderry
0

Recently I was checking interview questions and solved this problem with the help of @PengOne's hint for finding the first value.

So if anyone needs a complete working solution, here it is in PHP:

Time complexity: O((n * (n-2)) + 3 + n) with helper variables. Space complexity: almost the same as the time complexity.

<?php
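function getSublistSize($length)
{
    // Find n such that n * (n - 1) / 2 == $length, i.e. the number of
    // original values that produce $length pairwise sums.
    for ($n = 2; $n * ($n - 1) / 2 <= $length; ++$n) {
        if ($n * ($n - 1) / 2 == $length) {
            return $n;
        }
    }

    return 0; // $length is not a valid number of pairwise sums
}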

function findSubstractList(array $list)
{
    $length = count($list);

    $n = getSublistSize($length);
    $nth = $n - 1;

    $substractList = [];
    /**
     * Assumes $list is in generation order (A+B, A+C, ..., B+C, B+D, ...)
     * rather than sorted, so that $list[$nth] is B + C.
     *
     * formula : A = (list[0] + list[1] - list[nth]) / 2
     * list[0] = A + B,
     * list[1] = A + C,
     * list[nth] = B + C
     *
     * =>  ((A + B) + (A + C) - (B + C)) / 2
     * => (A + A + (B + C - B - C)) / 2
     * => (2A + 0) / 2 => 2A / 2
     * => A
     */
    $substractList[] = (($list[0] + $list[1]) - $list[$nth]) / 2;

    // Every other value is (A + X) - A, taken from the first $nth sums.
    for ($i = 0; $i < $nth; ++$i) {
        $substractList[] = ($list[$i] - $substractList[0]);
    }


    return $substractList;
}


$list = [5, 8, 14, 28, 40, 11, 17, 31, 43, 20, 34, 46, 40, 52, 66];

print_r(findSubstractList($list));

/**
 * P ) [6, 11, 101, 15, 105, 110];
 * S ) [1, 5, 10, 100]
 *
 * P ) [5, 8, 14, 28, 40, 11, 17, 31, 43, 20, 34, 46, 40, 52, 66];
 * S ) [1, 4, 7, 13, 27, 39]
 *
*/
FZE
-2

I am not sure about the fastest algorithm, but I can explain how this works.

The first number of the o/p is the difference between the first and second i/p:

5-4=1

So now you have your first o/p number.

The second number of the o/p is the first i/p minus the first o/p:

4-1=3

The third number of the o/p is the second i/p minus the first o/p:

5-1=4
Mateen Ulhaq
Isaac Fife