I have a piece of code that runs millions of times in a program, so it is performance critical. Inside this function, a single element of a vector V is updated, and I need to keep track of the sum S of its elements. I gained a large performance boost by computing the sum once when the vector is created and then updating it incrementally: subtract the current value of the updated element, then add the new value, as in
S -= V[i];
S += newValue;
V[i] = newValue;
instead of recomputing the whole sum with, say, std::accumulate(V.begin(), V.end(), 0.0). In another part of the program, I then generate a value using the sum S as the range of the possible outcomes, and this generated value is later found to be out of bounds. My question is: is there any loss of precision when maintaining the sum of elements this way? Put another way: after enough repetitions of this update procedure, can the value of S drift away from the value returned by std::accumulate? I am testing this hypothesis right now, but would appreciate some insight from someone with a better understanding of floating-point arithmetic. If there is indeed loss of precision and that is causing the errors, is there a fast but more precise way of maintaining the sum S?
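For concreteness, one candidate for a "more precise" running sum is compensated summation. The block below is only a minimal sketch of Neumaier's variant of Kahan summation applied to the update step, not code from my program: the names `CompensatedSum` and `updateElement` are hypothetical, and it assumes the compiler is not allowed to reassociate floating-point operations (for example, no `-ffast-math`). It reduces the accumulated rounding error but does not guarantee that it vanishes.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (an illustration, not part of the original program):
// a running sum with a separate compensation term, following Neumaier's
// variant of Kahan summation.
struct CompensatedSum {
    double sum = 0.0;   // leading part of the total
    double comp = 0.0;  // accumulated low-order bits that `sum` could not hold

    void add(double x) {
        const double t = sum + x;
        if (std::fabs(sum) >= std::fabs(x))
            comp += (sum - t) + x;  // low-order bits of x were rounded away
        else
            comp += (x - t) + sum;  // low-order bits of sum were rounded away
        sum = t;
    }

    // Best available estimate of the total.
    double value() const { return sum + comp; }
};

// Replace V[i] by newValue and keep the compensated total in step.
void updateElement(std::vector<double>& V, CompensatedSum& S,
                   std::size_t i, double newValue) {
    S.add(-V[i]);
    S.add(newValue);
    V[i] = newValue;
}
```

Each update costs a few extra additions and one branch, so it stays O(1) per update, unlike a full std::accumulate over V.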
Edit: The data type stored in V is double: std::vector<double> V;
and the sum of its elements is also stored in a double variable S: double S;
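To make the suspected effect concrete with doubles, here is a contrived sketch using hand-picked values (not data from my program). The helper names `incrementalUpdate` and `driftAfterZeroingFirst` are hypothetical, and the behaviour described assumes IEEE 754 binary64 arithmetic with round-to-nearest-even and no extended-precision intermediates.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Apply the incremental update from the question to a running sum S
// and return the updated running sum.
double incrementalUpdate(std::vector<double>& V, double S,
                         std::size_t i, double newValue) {
    S -= V[i];
    S += newValue;
    V[i] = newValue;
    return S;
}

// Set V[0] to zero via the incremental update, then return
// (running sum) - (sum recomputed from scratch).
// V is taken by value so the caller's vector is left untouched.
double driftAfterZeroingFirst(std::vector<double> V) {
    double S = std::accumulate(V.begin(), V.end(), 0.0);
    S = incrementalUpdate(V, S, 0, 0.0);
    return S - std::accumulate(V.begin(), V.end(), 0.0);
}
```

Under those assumptions, driftAfterZeroingFirst({1e16, 3.0}) returns 1.0, because 1e16 + 3.0 rounds up to 1e16 + 4 and the running sum keeps that extra 1 after 1e16 is subtracted. Likewise, driftAfterZeroingFirst({1e16, 1.0}) returns -1.0, because 1e16 + 1.0 rounds down to 1e16. So the running sum can end up on either side of the recomputed one.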
Edit2: As requested, here is a more detailed view of the procedure I'm using for the summation:
#include <iostream>
#include <numeric>   // std::accumulate
#include <random>
#include <stdexcept> // std::runtime_error
#include <vector>
// Choose an element of V to be updated. The probability of choosing
// index 'i' is 'V[i]/S'.
int chooseUpdateIndex(
    std::uniform_real_distribution<double>& uniform,
    std::default_random_engine& rng,
    double S,
    std::vector<double>& V)
{
    double randomLength = uniform(rng)*S;
    double partialSum = 0;
    for (std::size_t i = 0; i < V.size(); ++i) {
        partialSum += V[i];
        if (randomLength < partialSum) return static_cast<int>(i);
    }
    throw std::runtime_error("reached end of function without selecting an index");
}
int main()
{
    std::default_random_engine rng;
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    std::vector<double> V(200);
    for (auto& v : V) v = 10*uniform(rng); // populate vector
    double S = std::accumulate(V.begin(), V.end(), 0.0); // make initial sum
    for (int i = 0; i < 1000000; ++i) {
        // sometimes, with enough iterations, this function call throws an error
        // because randomLength is not smaller than the sum of all elements of V
        int updateIndex = chooseUpdateIndex(uniform, rng, S, V);
        double newValue = uniform(rng)*10;
        S -= V[updateIndex];
        S += newValue;
        V[updateIndex] = newValue;
    }
    double recalculatedS = std::accumulate(V.begin(), V.end(), 0.0);
    if (S == recalculatedS) std::cout << "S == recalculatedS\n";
    else if (S < recalculatedS) std::cout << "S < recalculatedS\n";
    else std::cout << "S > recalculatedS\n";
    return 0;
}
Running this example, I usually get S < recalculatedS, which suggests that rounding errors make the running sum smaller. But if S > recalculatedS, there is a chance that randomLength is not smaller than the actual sum of the elements, in which case chooseUpdateIndex throws an error.
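If the drift is indeed the cause, two defensive measures come to mind. The block below is a hedged sketch of both rather than a tested fix: `chooseIndexClamped` and `resyncIfDue` are hypothetical names, the first takes the already drawn randomLength instead of the distribution and engine, and the `period` parameter is an arbitrary tuning knob I am assuming for illustration.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical variant of chooseUpdateIndex that never throws.
// If randomLength is not smaller than the full partial sum (for example
// because the running sum S drifted above the true total), it falls back
// to the last index that holds a positive weight.
int chooseIndexClamped(double randomLength, const std::vector<double>& V) {
    double partialSum = 0.0;
    int lastPositive = -1;
    for (std::size_t i = 0; i < V.size(); ++i) {
        if (V[i] > 0.0) lastPositive = static_cast<int>(i);
        partialSum += V[i];
        if (randomLength < partialSum) return static_cast<int>(i);
    }
    return lastPositive;  // -1 only if no element is positive
}

// Hypothetical helper: recompute S from scratch whenever updateCount is a
// multiple of `period`, otherwise keep the incrementally maintained value.
double resyncIfDue(double S, const std::vector<double>& V,
                   std::size_t updateCount, std::size_t period) {
    if (period != 0 && updateCount % period == 0)
        return std::accumulate(V.begin(), V.end(), 0.0);
    return S;
}
```

The fallback slightly biases the selection toward the last positive element whenever it triggers, and the periodic recomputation adds an amortized cost of roughly V.size()/period additions per update, so both are trade-offs rather than exact solutions.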