I've been writing a C program to simulate the motion of n bodies under the influence of gravity. I have a working version that uses a single thread, and I'm attempting to write a version that uses multi-threading with the POSIX pthreads library. Essentially, the program initializes a specified number n of bodies, and stores their randomly selected initial positions as well as masses and radii in an array 'data', made using the pointer-to-pointer method. Data is a pointer in the global scope, and it is allocated the correct amount of memory in the 'populate()' function in main(). Then, I spawn twelve threads (I am using a 6 core processor, so I thought 12 would be a good starting point), and each thread is assigned a set of objects in the simulation. The thread function below calculates the interaction between all objects in 'data' and the object currently being operated on. See the function below:
void* calculate_step(void*index_val) {
int index = * (int *)index_val;
long double x_dist;
long double y_dist;
long double distance;
long double force;
for (int i = 0; i < (rows/nthreads); ++i) { //iterate over every object assigned to this thread
data[i+index][X_FORCE] = 0; //reset all forces to 0
data[i+index][Y_FORCE] = 0;
data[i+index][X_ACCEL] = 0;
data[i+index][X_ACCEL] = 0;
for (int j = 0; j < rows; ++j) { //iterate over every possible pair with this object i
if (i != j && data[j][DELETED] != 1 && data[i+index][DELETED] != 1) { //continue if not comparing an object with itself and if the other object has not been deleted previously.
x_dist = data[j][X_POS] - data[i+index][X_POS];
y_dist = data[j][Y_POS] - data[i+index][X_POS];
distance = sqrtl(powl(x_dist, 2) + powl(y_dist, 2));
if (distance > data[i+index][RAD] + data[j][RAD]) {
force = G * data[i+index][MASS] * data[j][MASS] /
powl(distance, 2); //calculate accel, vel, pos, data for pair of non-colliding objects
data[i+index][X_FORCE] += force * (x_dist / distance);
data[i+index][Y_FORCE] += force * (y_dist / distance);
data[i+index][X_ACCEL] = data[i+index][X_FORCE]/data[i+index][MASS];
data[i+index][X_VEL] += data[i+index][X_ACCEL]*dt;
data[i+index][X_POS] += data[i+index][X_VEL]*dt;
data[i+index][Y_ACCEL] = data[i+index][Y_FORCE]/data[i+index][MASS];
data[i+index][Y_VEL] += data[i+index][Y_ACCEL]*dt;
data[i+index][Y_POS] += data[i+index][Y_VEL]*dt;
}
else{
if (data[i+index][MASS] < data[j][MASS]) {
int temp;
temp = i;
i = j;
j = temp;
} //conserve momentum
data[i+index][X_VEL] = (data[i+index][X_VEL] * data[i+index][MASS] + data[j][X_VEL] * data[j][MASS])/(data[i+index][MASS] + data[i+index][MASS]);
data[i+index][Y_VEL] = (data[i+index][Y_VEL] * data[i+index][MASS] + data[j][Y_VEL] * data[j][MASS])/(data[i+index][MASS] + data[i+index][MASS]);
//conserve center of mass position
data[i+index][X_POS] = (data[i+index][X_POS] * data[i+index][MASS] + data[j][X_POS] * data[j][MASS])/(data[i+index][MASS] + data[i+index][MASS]);
data[i+index][Y_POS] = (data[i+index][Y_POS] * data[i+index][MASS] + data[j][Y_POS] * data[j][MASS])/(data[i+index][MASS] + data[i+index][MASS]);
//conserve mass
data[i+index][MASS] += data[j][MASS];
//increase radius proportionally to dM
data[i+index][RAD] = powl(powl(data[i+index][RAD], 3) + powl(data[j][RAD], 3), ((long double) 1 / (long double) 3));
data[j][DELETED] = 1;
data[j][MASS] = 0;
data[j][RAD] = 0;
}
}
}
}
return NULL;
}
This calculates values for velocity, acceleration, etc. and writes them to the array. Each thread does this once for each object assigned to it (i.e. 36 objects means each thread calculates values for 3 objects). The thread then returns and the main loop jumps to the next time step (usually increments of 0.01 seconds), and the process repeats again. If two balls collide, their masses, momenta and centers of mass are added, and one of the objects' 'DELETED' index in its row in the array is marked with a row. This object is then ignored in all future iterations. See the main loop below:
int main() {
pthread_t *thread_array; //pointer to future thread array
long *thread_ids;
short num_obj;
short sim_time;
printf("Number of objects to simulate: \n");
scanf("%hd", &num_obj);
num_obj = num_obj - num_obj%12;
printf("Timespan of the simulation: \n");
scanf("%hd", &sim_time);
printf("Length of time steps: \n");
scanf("%f", &dt);
printf("Relative complexity score: %.2f\n", (((float)sim_time/dt)*((float)(num_obj^2)))/1000);
thread_array = malloc(nthreads*sizeof(pthread_t));
thread_ids = malloc(nthreads*sizeof(long));
populate(num_obj);
int index;
for (int i = 0; i < nthreads; ++i) { //initialize all threads
}
time_t start = time(NULL);
print_data();
for (int i = 0; i < (int)((float)sim_time/dt); ++i) { //main loop of simulation
for (int j = 0; j < nthreads; ++j) {
index = j*(rows/nthreads);
thread_ids[j] = j;
pthread_create(&thread_array[j], NULL, calculate_step, &index);
}
for (int j = 0; j < nthreads; ++j) {
pthread_join(thread_array[j], NULL);
//pthread_exit(NULL);
}
}
time_t end = time(NULL) - start;
printf("\n");
print_data();
printf("Took %zu seconds to simulate %d frames with %d objects initially, now %d objects.\n", end, (int)((float)sim_time/dt), num_obj, rows);
}
Every time the program runs, I get the following message:
Number of objects to simulate:
36
Timespan of the simulation:
10
Length of time steps:
0.01
Relative complexity score: 38.00
Process finished with exit code -1073740940 (0xC0000374)
which seams to indicate the heap is getting corrupted. I am guessing this has to do with the data array pointer being a global variable, but that was my workaround for only being allowed to pass one arg to the pthreads function.
I have tried stepping through the program with the debugger, and it seems it works when I run it in debug mode (I am using CLion), but not in regular compile mode. Furthermore, when i debug the program and it outputs the values of the data array for the last simulation 'frame', the first chunk of values which were supposed to be handled by the first thread that spawns are unchanged. When I go through it with the debugger however I can see that thread being created in the thread generation loop. What are some issues with this code structure and what could be causing the heap corruption and the first thread doing nothing?