I am writing a program that will match up one block(a group of 4 double numbers which are within certain absolute value) with another.
Essentially, I will call the function in main.
The matrix has 4399 rows and 500 columns.I am trying to use OpenMp to speed up the task yet my code seems to have race condition within the innermost loop (where the actual creation of block happens create_Block(rrr[k], i);
).
It is ok to ignore all the function detail as they are working well in serial version. The only focus here is the OpenMP derivatives.
int main(void) {
readKey("keys.txt");
double** jz = readMatrix("data.txt");
int j = 0;
int i = 0;
int k = 0;
#pragma omp parallel for firstprivate(i) shared(Big_Block,NUM_OF_BLOCK,SIZE_OF_COLLECTION,b)
for (i = 0; i < 50; i++) {
printf("THIS IS COLUMN %d\n", i);
double*c = readCol(jz, i, 4400);
#pragma omp parallel for firstprivate(j) shared(i,Big_Block,NUM_OF_BLOCK,SIZE_OF_COLLECTION,b)
for (j=0; j < 4400; j++) {
// printf("This is fixed row %d from column %d !!!!!!!!!!\n",j,i);
int* one_collection = collection(c, j, 4400);
// MODIFY THE DYMANIC ALLOCATION OF SPACES (SIZE_OF_COMBINATION) IN combNonRec() function.
if (get_combination_size(SIZE_OF_COLLECTION, M) >= 4) {
//GET THE 2D-ARRAY OF COMBINATION
int** rrr = combNonRec(one_collection, SIZE_OF_COLLECTION, M);
#pragma omp parallel for firstprivate(k) shared(i,j,Big_Block,NUM_OF_BLOCK,SIZE_OF_COLLECTION,b)
for (k = 0; k < get_combination_size(SIZE_OF_COLLECTION, M); k++) {
create_Block(rrr[k], i); //ACTUAL CREATION OF BLOCK !!!!!!!
printf("This is block %d \n", NUM_OF_BLOCK);
add_To_Block_Collection();
}
free(rrr);
}
free(one_collection);
}
//OpenMP for j
free(c);
}
// OpenMP for i
collision();
}
Here is the parallel version result: non-deterministic
Whereas the serial result has constant 400 blocks.
Big_Block,NUM_OF_BLOCK,SIZE_OF_COLLECTION are global variable.
Did I do anything wrong in the derivative declaration? What might have caused such problem?