
Here's my solution to an InterviewBit problem (link).

You are given a read only array of n integers from 1 to n. Each integer appears exactly once except A which appears twice and B which is missing. Return A and B. Note: Your algorithm should have a linear runtime complexity. Could you implement it without using extra memory? Note that in your output A should precede B. N <= 10^5

It looks like there's an overflow problem somewhere. Could you point out where it happens and suggest fixes?

  typedef long long int unit;

vector<int> Solution::repeatedNumber(const vector<int> &A) {
    unit n = A.size();
    unit sum = n*(n+1)/2;
    unit sumsq = n*(n+1)*(2*n+1)/6;
    unit arrsum = std::accumulate(A.begin(), A.end(), 0);


    unit arrsq = 0;
    for(int item : A) {
        arrsq += (unit)item*item;
    }

    unit c1 = arrsum - sum;

    unit c2 = arrsq - sumsq;

    unit a = (c2/c1 + c1);
    a/=2;

    unit b = (c2/c1 - c1);
    b/=2;

    return {a, b};
}

P.S. It has to be an overflow problem, because the same solution works in Python.

Update: Here's the solution provided by the authors of the problem. It's interesting how it fights overflow in the summation by subtracting the expected values as it goes.

 class Solution {
public:
    vector<int> repeatedNumber(const vector<int> &V) {
       long long sum = 0;
       long long squareSum = 0;
       long long temp;
       for (int i = 0; i < V.size(); i++) {
           temp = V[i];
           sum += temp;
           sum -= (i + 1);
           squareSum += (temp * temp);
           squareSum -= ((long long)(i + 1) * (long long)(i + 1));
       }
       // sum = A - B
       // squareSum = A^2 - B^2 = (A - B)(A + B)
       // squareSum / sum = A + B
       squareSum /= sum;

       // Now we have A + B and A - B. Lets figure out A and B now. 
       int A = (int) ((sum + squareSum) / 2);
       int B = squareSum - A;

       vector<int> ret;
       ret.push_back(A);
       ret.push_back(B);
       return ret;
    }
};
Dmitry S.

2 Answers


The problem is this:

unit arrsum = std::accumulate(A.begin(), A.end(), 0);

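accumulate sums in the type of its third argument, so the literal 0 makes it add everything up as int. You need to use 0LL to make it accumulate the values as long long.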

Code that demonstrates the problem:

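#include <iostream>
#include <numeric>
#include <vector>
using namespace std;

int main()
{
    vector<int> A;
    for (int i = 0; i < 1000000; ++i)
        A.push_back(1000000);

    // The third argument sets the accumulator type:
    // 0 sums in int (and overflows), 0LL sums in long long.
    long long arrsum = accumulate(A.begin(), A.end(), 0LL);
    cout << arrsum;

    return 0;
}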

With a plain 0 instead of 0LL this outputs -727379968; with 0LL it outputs the correct result, 1000000000000.

Note that you can also use accumulate to compute the sum of squares:

unit arrsq = accumulate(A.begin(), A.end(), 0LL, 
                             [](unit x, unit y) { return x + y*y; });
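
Putting the two accumulate calls together, a minimal sketch of the corrected function could look like the one below. It is written as a free function named repeatedNumber rather than the judge's Solution:: method (an assumption, so that it compiles on its own). The assert and the static_casts on the return value are additions for this sketch, not part of the original code.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

typedef long long unit;

// Free-function version of the asker's method; the signature outside of
// Solution:: is an assumption made for this sketch.
std::vector<int> repeatedNumber(const std::vector<int>& A) {
    unit n = static_cast<unit>(A.size());
    unit sum = n * (n + 1) / 2;                  // 1 + 2 + ... + n
    unit sumsq = n * (n + 1) * (2 * n + 1) / 6;  // 1^2 + 2^2 + ... + n^2

    // 0LL makes the accumulator a long long, so neither sum wraps.
    unit arrsum = std::accumulate(A.begin(), A.end(), 0LL);
    unit arrsq = std::accumulate(A.begin(), A.end(), 0LL,
                                 [](unit acc, unit x) { return acc + x * x; });

    unit c1 = arrsum - sum;   // a - b
    unit c2 = arrsq - sumsq;  // a^2 - b^2 = (a - b)(a + b)
    assert(c1 != 0);          // holds for valid input, where a != b

    unit a = (c2 / c1 + c1) / 2;
    unit b = (c2 / c1 - c1) / 2;
    return {static_cast<int>(a), static_cast<int>(b)};
}
```

For the input {3, 1, 2, 5, 3} this returns {3, 4}: 3 is repeated and 4 is missing.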
IVlad
  • This is a horrible example of how C++ can shoot an innocent programmer in the leg. I hope that the compiler at least generated a warning. – Ophir Gvirtzer Jun 07 '15 at 14:04
  • @OphirGvirtzer - why? I use C++ very little but it immediately jumped at me that he was using `accumulate` on `int`s. What else could it return other than an `int`? Making the initial value `0LL` might not be so intuitive, but changing the initial array to `long long` is intuitive and would have worked too. – IVlad Jun 07 '15 at 14:14
  • @IVlad You're right on this example. But the same happens when the array is long long if not using 0LL; I consider this quite terrible (and VC++ doesn't warn). – Ophir Gvirtzer Jun 07 '15 at 14:20
  • @OphirGvirtzer oh, I thought it would work with just long long... that is pretty weird indeed. – IVlad Jun 07 '15 at 14:25
  • Only a few can understand the rules for template type deduction... – Ophir Gvirtzer Jun 07 '15 at 14:28
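
The point raised in these comments can be checked at compile time: std::accumulate returns the type of its third argument, not the element type of the range. A small sketch follows; the alias LLIter and the helper sumWithLongLongInit are names made up for illustration.

```cpp
#include <numeric>
#include <type_traits>
#include <utility>
#include <vector>

// Iterator type of a vector<long long> (alias name is hypothetical).
using LLIter = std::vector<long long>::const_iterator;

// Even over long long elements, an init of 0 gives an int result type.
static_assert(
    std::is_same<decltype(std::accumulate(std::declval<LLIter>(),
                                          std::declval<LLIter>(), 0)),
                 int>::value,
    "init 0 deduces an int accumulator");

// An init of 0LL gives a long long result type.
static_assert(
    std::is_same<decltype(std::accumulate(std::declval<LLIter>(),
                                          std::declval<LLIter>(), 0LL)),
                 long long>::value,
    "init 0LL deduces a long long accumulator");

// Safe sum: the accumulator is long long because of the 0LL literal.
long long sumWithLongLongInit(const std::vector<long long>& v) {
    return std::accumulate(v.begin(), v.end(), 0LL);
}
```

Both static_asserts pass, which is why changing only the element type to long long does not help.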

The potential overflow problems are:

unit sum = n*(n+1)/2;

Here the maximum value of n is 10^5. Hence n*(n+1) yields about 10^10 first, and only then is the division computed, because the expression is evaluated left to right.

The second place is

unit sumsq = n*(n+1)*(2*n+1)/6;

The intermediate value computed here goes up to about 2*10^15.

There is also an integer overflow where you compute the sum of squares of all the numbers.

Nivetha
  • `unit` is typedef-ed to `long long`, so those shouldn't overflow. – IVlad Jun 07 '15 at 13:19
  • There shouldn't be an overflow in these 3 expressions. long long int is defined as "Not smaller than long. At least 64 bits." A signed 64-bit integer can hold values up to about 10^18.9, while (n+1)*(n+1)*(2n+1) is below 10^16 in the worst case. – Ophir Gvirtzer Jun 07 '15 at 13:23
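
The bound in this comment can be checked with a few lines. The sketch below evaluates the largest intermediate product for n = 10^5 and compares it with the long long limit; the function name largestIntermediate is hypothetical.

```cpp
#include <limits>

// Largest intermediate value in n*(n+1)*(2*n+1)/6, taken before the division.
constexpr long long largestIntermediate(long long n) {
    return n * (n + 1) * (2 * n + 1);
}

// For n = 10^5 the product is about 2 * 10^15.
static_assert(largestIntermediate(100000) == 2000030000100000LL,
              "exact product for n = 10^5");

// That leaves more than three orders of magnitude of headroom.
static_assert(largestIntermediate(100000) <
                  std::numeric_limits<long long>::max() / 1000,
              "well below the long long limit");
```

Both assertions hold at compile time, so the closed-form sums are safe once n is stored in a long long.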