Keeping track of the sum of vector elements without loss

Question

I have a piece of code that runs millions of times in a program, so it is performance-critical. One thing this function does is update a single element of a vector, V, whose element sum, S, I need to keep track of. I gained a large performance boost by computing the element sum once when the vector is created and then updating it by subtracting the current value of the updated element and adding the new value, as in

    S -= V[i];
    S += newValue;
    V[i] = newValue;

instead of recomputing the whole sum with, say, std::accumulate(V.begin(), V.end(), 0.0). In another part of the program, I generate a random value using the sum S as the upper bound of the possible outcomes, and this generated value is sometimes found to be out of bounds. My question is: is there any loss of precision when the sum of elements is maintained this way? Put another way: after enough repetitions of this update procedure, will the value of S drift away from the value returned by 'accumulate'? I am testing this hypothesis right now, but I would appreciate some insight from someone with a better understanding of floating-point arithmetic. If there is indeed a loss of precision and it is causing the errors, is there a fast but more precise way of maintaining the sum S?

Edit: The data type stored in V is double: std::vector<double> V; and the sum of its elements is also stored in a double variable S: double S;

Edit2: As requested, here is a more detailed view of the procedure I am using for the summation:

    #include <iostream>
    #include <numeric>   // std::accumulate
    #include <random>
    #include <stdexcept> // std::runtime_error
    #include <vector>

    // Choose an element of V to be updated. The probability
    // of choosing index 'i' is just 'V[i]/S'.
    int chooseUpdateIndex(
        std::uniform_real_distribution<double>& uniform,
        std::default_random_engine& rng,
        double S,
        std::vector<double>& V)
    {
        double randomLength = uniform(rng)*S;
        double partialSum = 0;
        for (int i = 0; i < static_cast<int>(V.size()); ++i) {
            partialSum += V[i];
            if (randomLength < partialSum) return i;
        }
        throw std::runtime_error("reached end of function without selecting an index");
    }

    int main()
    {
        std::default_random_engine rng;
        std::uniform_real_distribution<double> uniform(0.0, 1.0);
        std::vector<double> V(200);

        for (auto& v : V) v = 10*uniform(rng); // populate vector
        double S = std::accumulate(V.begin(), V.end(), 0.0); // make initial sum
        for (int i = 0; i < 1000000; ++i) {
            // sometimes, with enough iterations, this function call throws an error
            // because randomLength exceeds the sum of the elements of V
            int updateIndex = chooseUpdateIndex(uniform, rng, S, V);
            double newValue = uniform(rng)*10;
            S -= V[updateIndex];
            S += newValue;
            V[updateIndex] = newValue;
        }

        double recalculatedS = std::accumulate(V.begin(), V.end(), 0.0);

        if (S == recalculatedS) std::cout << "S == recalculatedS\n";
        else if (S < recalculatedS) std::cout << "S < recalculatedS\n";
        else std::cout << "S > recalculatedS\n";

        return 0;
    }

Running this example, I usually get S < recalculatedS, which indicates that rounding errors are making the tracked value smaller. But whenever S > recalculatedS, there is a chance that the chooseUpdateIndex function will throw an error.
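One possible stopgap for the exception, sketched under the assumption that a tiny selection bias toward the last element is acceptable: clamp the selection to the last index when rounding leaves randomLength beyond the final partial sum, and recompute S from scratch every so often so that the drift stays bounded. The names chooseIndexClamped and resyncSum, and the choice to pass the random number u directly, are hypothetical and only serve to keep the sketch small.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical variant of chooseUpdateIndex. It takes an already-drawn
// random number u in [0, 1) instead of the distribution and engine.
std::size_t chooseIndexClamped(double u, double S, const std::vector<double>& V)
{
    assert(!V.empty());
    const double randomLength = u * S;
    double partialSum = 0.0;
    for (std::size_t i = 0; i < V.size(); ++i) {
        partialSum += V[i];
        if (randomLength < partialSum) return i;
    }
    // Reached only when the tracked S is larger than the sum actually
    // accumulated in this loop: fall back to the last index, no exception.
    return V.size() - 1;
}

// Hypothetical helper: recompute S from scratch once every `interval`
// updates, so the rounding error of the incremental updates is reset.
void resyncSum(double& S, const std::vector<double>& V,
               std::size_t updateCount, std::size_t interval)
{
    if (interval != 0 && updateCount % interval == 0)
        S = std::accumulate(V.begin(), V.end(), 0.0);
}
```

With 200 elements, a resync every few thousand updates adds very little cost per update. Note that the fallback returns the last index even if V.back() happens to be zero, so a stricter version would step back to the last element with nonzero weight.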

If the numbers have decimals, I would expect there to be a loss of precision over time. With integers, you mostly have to worry about overflow. — Millie Smith, Jul 08 '17 at 14:42
You are being very vague. Please add an [mcve] to your question. — Ben Steffan, Jul 08 '17 at 14:44
I don't think more code is needed. The question just needs to specify the type of the elements in `V` and the expected minimum and maximum value of `S`. — Millie Smith, Jul 08 '17 at 14:48
Really depends on if the sum is in the range both S and the vector item can contain. — Retired Ninja, Jul 08 '17 at 14:49
Consider looking into the Kahan Summation Algorithm to keep track of the running error. https://stackoverflow.com/a/6699451/2850543. — Millie Smith, Jul 08 '17 at 14:52
This answer linking to a paper is also interesting: https://stackoverflow.com/a/6705598/2850543 — Millie Smith, Jul 08 '17 at 14:58
Note that with float/double, `a + b + c` may differ from `a + c + b`. — Jarod42, Jul 08 '17 at 18:42
Thanks for all the comments, I'll have a look at the links. I have edited the question to include more information. If the vector elements are doubles, would storing their sum in a long double be better than storing it in another double? — Kevin Liu, Jul 08 '17 at 21:19
@BenSteffan I will see about adding an example, it is just that I was not completely sure about rounding errors being the cause of problems. Now that I think this is most likely the problem, I'll try to add a comprehensive piece of working code. Thanks. — Kevin Liu, Jul 08 '17 at 21:24
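The Kahan summation suggestion from the comments could be adapted to the running sum roughly as follows. This is a minimal sketch, not a tested drop-in fix: it uses Neumaier's variant of compensated summation (chosen here because the subtracted and added terms can have either sign), and the CompensatedSum type and its member names are hypothetical. It also assumes the compiler is not allowed to reassociate floating-point operations (for example, no -ffast-math), since that would optimize the compensation away.

```cpp
#include <cmath>

// Hypothetical running sum with a Neumaier-style compensation term.
struct CompensatedSum {
    double sum  = 0.0; // plain running sum
    double comp = 0.0; // accumulated low-order bits lost from `sum`

    void add(double x)
    {
        const double t = sum + x;
        if (std::abs(sum) >= std::abs(x))
            comp += (sum - t) + x; // low-order bits of x were lost
        else
            comp += (x - t) + sum; // low-order bits of sum were lost
        sum = t;
    }

    // Replace one element's contribution: remove the old value, add the new.
    void replace(double oldValue, double newValue)
    {
        add(-oldValue);
        add(newValue);
    }

    // Best current estimate of the total.
    double value() const { return sum + comp; }
};
```

In the loop from the question, the three update statements would become `S.replace(V[updateIndex], newValue); V[updateIndex] = newValue;`, with `S.value()` passed to chooseUpdateIndex. The compensation reduces the drift considerably but does not guarantee that the tracked value equals what std::accumulate returns, because std::accumulate itself rounds at every addition.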

