I am porting a large MPI-based physics code to OpenMP tasking. On a Cray supercomputer the code compiles, links and runs perfectly (built with the cray-mpich library and the Cray compiler). The code was then moved to a server used for Jenkins continuous integration (where I have no admin rights), and the only available compiler there is GCC 4.x (the Cray compiler cannot be used since it is not a Cray machine). On that server the code does not compile; I get this error:
... error: ‘pcls’ implicitly determined as ‘firstprivate’ has reference type
#pragma omp task
^
It is spaghetti code, so it is hard to paste here the exact lines that cause the error, but my guess is that it is the problem described here: http://forum.openmp.org/forum/viewtopic.php?f=5&t=117
Is there any way to work around this? It seems to have been resolved in GCC 6 (which implements OpenMP 4.5), but I am not sure. Has anyone run into this situation? A minimal example of what I think is going on is below.
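To make the problem concrete, here is a minimal example (not taken from the real code, just my reconstruction of the pattern, with made-up names). A reference variable declared inside the parallel region is referenced in a task, so it is implicitly determined firstprivate; if I understand correctly, privatizing variables of reference type is only allowed from OpenMP 4.5 onwards, which is why pre-GCC-6 compilers reject it:

#include <vector>

void touch_first(std::vector<double>* parts, int nspecies)
{
  #pragma omp parallel
  {
    for (int s = 0; s < nspecies; s++)
    {
      // reference declared inside the parallel region, private to each thread
      std::vector<double>& pcls = parts[s];
      #pragma omp task // pcls is referenced here, so it is implicitly firstprivate
      {
        // read-only use, just to reference pcls inside the task
        const double first = pcls.empty() ? 0.0 : pcls[0];
        (void)first;
      }
    }
  }
}

As far as I can tell, g++ 4.x with -fopenmp rejects this with the same message, while GCC 6 accepts it.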
UPD: here is the skeleton of one function where this error occurs (sorry for the long listing!):
void EMfields3D::sumMoments_vectorized(const Particles3Dcomm* part)
{
  grid_initialisation(...);
  #pragma omp parallel
  {
    for (int species_idx = 0; species_idx < ns; species_idx++)
    {
      const Particles3Dcomm& pcls = part[species_idx];
      assert_eq(pcls.get_particleType(), ParticleType::SoA);
      const int is = pcls.get_species_num();
      assert_eq(species_idx,is);
      double const*const x = pcls.getXall();
      double const*const y = pcls.getYall();
      double const*const z = pcls.getZall();
      double const*const u = pcls.getUall();
      double const*const v = pcls.getVall();
      double const*const w = pcls.getWall();
      double const*const q = pcls.getQall();
      const int nop = pcls.getNOP();
      #pragma omp master
      {
        start_timing_for_moments_accumulation(...);
      }
      ...
      #pragma omp for // because shared
      for(int i=0; i<moments1dsize; i++)
        moments1d[i]=0;
      // prevent threads from writing to the same location
      for(int cxmod2=0; cxmod2<2; cxmod2++)
      for(int cymod2=0; cymod2<2; cymod2++)
      // each mesh cell is handled by its own thread
      #pragma omp for collapse(2)
      for(int cx=cxmod2;cx<nxc;cx+=2)
      for(int cy=cymod2;cy<nyc;cy+=2)
      for(int cz=0;cz<nzc;cz++)
      #pragma omp task
      {
        const int ix = cx + 1;
        const int iy = cy + 1;
        const int iz = cz + 1;
        {
          // reference the 8 nodes to which we will
          // write moment data for particles in this mesh cell.
          //
          arr1_double_fetch momentsArray[8];
          arr2_double_fetch moments00 = moments[ix][iy];
          arr2_double_fetch moments01 = moments[ix][cy];
          arr2_double_fetch moments10 = moments[cx][iy];
          arr2_double_fetch moments11 = moments[cx][cy];
          momentsArray[0] = moments00[iz]; // moments000
          momentsArray[1] = moments00[cz]; // moments001
          momentsArray[2] = moments01[iz]; // moments010
          momentsArray[3] = moments01[cz]; // moments011
          momentsArray[4] = moments10[iz]; // moments100
          momentsArray[5] = moments10[cz]; // moments101
          momentsArray[6] = moments11[iz]; // moments110
          momentsArray[7] = moments11[cz]; // moments111
          const int numpcls_in_cell = pcls.get_numpcls_in_bucket(cx,cy,cz);
          const int bucket_offset = pcls.get_bucket_offset(cx,cy,cz);
          const int bucket_end = bucket_offset+numpcls_in_cell;
          some_manipulation_with_moments_accumulation(...);
        }
      }
      #pragma omp master
      {
        end_timing_for_moments_accumulation(...);
      }
      // reduction
      #pragma omp master
      {
        start_timing_for_moments_reduction(...);
      }
      {
        #pragma omp for collapse(2)
        for(int i=0;i<nxn;i++)
        {
          for(int j=0;j<nyn;j++)
          {
            for(int k=0;k<nzn;k++)
            #pragma omp task
            {
              rhons[is][i][j][k] = invVOL*moments[i][j][k][0];
              Jxs  [is][i][j][k] = invVOL*moments[i][j][k][1];
              Jys  [is][i][j][k] = invVOL*moments[i][j][k][2];
              Jzs  [is][i][j][k] = invVOL*moments[i][j][k][3];
              pXXsn[is][i][j][k] = invVOL*moments[i][j][k][4];
              pXYsn[is][i][j][k] = invVOL*moments[i][j][k][5];
              pXZsn[is][i][j][k] = invVOL*moments[i][j][k][6];
              pYYsn[is][i][j][k] = invVOL*moments[i][j][k][7];
              pYZsn[is][i][j][k] = invVOL*moments[i][j][k][8];
              pZZsn[is][i][j][k] = invVOL*moments[i][j][k][9];
            }
          }
        }
      }
      #pragma omp master
      {
        end_timing_for_moments_reduction(...);
      }
    }
  }
  for (int i = 0; i < ns; i++)
  {
    communicateGhostP2G(i);
  }
}
Please don't try to find logic here (such as why there is a "#pragma omp parallel" followed by a for loop without "#pragma omp for", or why there is a task construct inside a for loop)... I did not implement this code, but I have to port it to OpenMP tasking...
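For completeness, the workaround I am currently experimenting with is to replace the reference with a pointer, so that the task only captures a plain pointer when it is implicitly firstprivate. This is only a sketch of the change, not something I have verified on the Jenkins server yet:

// instead of:
//   const Particles3Dcomm& pcls = part[species_idx];
const Particles3Dcomm* pcls = &part[species_idx];
// ...and every use of pcls then goes through the pointer, e.g. inside the task:
const int numpcls_in_cell = pcls->get_numpcls_in_bucket(cx,cy,cz);
const int bucket_offset = pcls->get_bucket_offset(cx,cy,cz);

I have also seen the suggestion to add an explicit data-sharing clause (e.g. shared(pcls)) to the task so that nothing of reference type is implicitly privatized, but I am not sure that is safe if tasks are deferred beyond the loop iteration that declared pcls, so advice on the cleanest fix for GCC 4 would be appreciated.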