I am trying to write a simulation where different threads need to perform a given calculation on a thread-specific interval (in the minimal example here that interval is between 1 and 4) based on an atomic simulation time managed by a parent thread.
The idea is to have the parent advance the simulation by a single time step (always 1 here, for simplicity) and then have all the threads independently check whether they need to do a calculation. Once a thread has checked, it decrements an atomic counter and waits until the next step. I expect that after running this code the number of calculations for each thread would be exactly the length of the simulation (i.e. 10000 steps) divided by the thread-specific interval, rounded down (so a thread with an interval of 4 should do exactly 2500 calculations).
#include <atomic>
#include <iostream>
#include <string>
#include <thread>

std::atomic<int> simTime;  // simulation time, advanced only by the parent thread
std::atomic<int> tocalc;   // counter decremented by each thread once it has handled a step
int end = 10000;           // total number of simulation steps

void threadFunction(int n);

int main() {
    const int nthreads = 4;
    std::thread threads[nthreads];
    for (int ii = 0; ii < nthreads; ii++) {
        threads[ii] = std::thread(threadFunction, ii + 1);
    }
    simTime = 0;
    tocalc = 0;
    while (simTime < end) {
        tocalc = nthreads - 1;
        simTime += 1;
        // do calculation
        while (tocalc > 0) {
            // wait until all the threads have done their calculation
            // or at least checked to see if they need to
        }
    }
    for (int ii = 0; ii < nthreads; ii++) {
        threads[ii].join();
    }
}

// n is the thread-specific interval (in steps) between calculations
void threadFunction(int n) {
    int prev = simTime;  // last simulation time this thread reacted to
    int fix = prev;      // simulation time of this thread's last calculation
    int ncalcs = 0;      // number of calculations performed so far
    while (simTime < end) {
        if (simTime - prev > 0) {
            prev = simTime;
            if (simTime - fix >= n) {
                // do calculation
                ncalcs++;
                fix = simTime;
            }
            tocalc--;
        }
    }
    std::cout << std::to_string(n) + " {ncalcs} - " + std::to_string(ncalcs) + "\n";
}
However, the output is not consistent with that expectation. One possible output is:
2 {ncalcs} - 4992
1 {ncalcs} - 9983
3 {ncalcs} - 3330
4 {ncalcs} - 2448
The expected output is:
2 {ncalcs} - 5000
1 {ncalcs} - 10000
3 {ncalcs} - 3333
4 {ncalcs} - 2500
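To make the expectation explicit: the expected counts are just the integer division of the run length by each interval. A small compile-time check of that arithmetic is below; `expectedCalcs` is a hypothetical helper name used only for this illustration, not part of my program.

```cpp
// Hypothetical helper (illustration only): the number of calculations a
// thread with interval n should perform over simEnd steps, assuming it
// never misses a step.
constexpr int expectedCalcs(int simEnd, int n) {
    return simEnd / n;  // integer division: one calculation every n-th step
}

// The expected values listed above, checked at compile time.
static_assert(expectedCalcs(10000, 1) == 10000, "interval 1");
static_assert(expectedCalcs(10000, 2) == 5000, "interval 2");
static_assert(expectedCalcs(10000, 3) == 3333, "interval 3");
static_assert(expectedCalcs(10000, 4) == 2500, "interval 4");
```

Every count in the actual output falls short of these values, so each thread appears to be missing some steps.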
I am wondering if anyone has insight as to why this method of forcing the threads to wait for the next step seems to be failing - if it is perhaps a simple issue with my code or if it is a more fundamental problem with the approach. Any insight is appreciated, thanks.
Note
I am using this approach because the overhead of the other methods I have tried (e.g. using pipes, joining at each step) is prohibitively expensive. If there is a less expensive way of communicating between the threads, I am open to suggestions.
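For comparison, here is a minimal sketch of the kind of lightweight lockstep I am after, still using only atomics. It rests on assumptions that differ from the code above: the counter is reset to the full number of threads, each worker reads the simulation time once per loop iteration and works from that snapshot, and each worker loops until it has handled the final step itself. The names `runLockstep` and `pending` are illustrative only, and I have not verified that these changes are the whole explanation for the discrepancy.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Illustrative sketch: runs simEnd steps with nthreads workers, where
// worker i uses interval i + 1. Returns the calculation count per worker.
std::vector<int> runLockstep(int nthreads, int simEnd) {
    std::atomic<int> simTime{0};
    std::atomic<int> pending{0};  // workers that have not yet handled the current step
    std::vector<int> ncalcs(nthreads, 0);
    std::vector<std::thread> workers;

    for (int i = 0; i < nthreads; ++i) {
        workers.emplace_back([&, i] {
            const int n = i + 1;  // thread-specific interval
            int prev = 0;         // last step this worker handled
            int fix = 0;          // step of this worker's last calculation
            while (prev < simEnd) {
                const int now = simTime.load();  // one snapshot per iteration
                if (now == prev) {
                    std::this_thread::yield();   // no new step published yet
                    continue;
                }
                prev = now;
                if (now - fix >= n) {
                    ++ncalcs[i];  // stand-in for the real calculation
                    fix = now;
                }
                pending.fetch_sub(1);  // report this step as handled
            }
        });
    }

    for (int step = 1; step <= simEnd; ++step) {
        pending.store(nthreads);  // reset the counter before publishing the step
        simTime.store(step);
        while (pending.load() > 0) {
            std::this_thread::yield();  // wait for every worker to check in
        }
    }
    for (auto& w : workers) {
        w.join();
    }
    return ncalcs;
}
```

Calling `runLockstep(4, 10000)` should give one count per worker, in interval order 1 to 4. The `yield()` calls are a design choice to stay usable when there are fewer cores than threads; a pure busy-wait would also work but burns more CPU.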