I am trying to modify an existing particle-method code to run on a GPU using OpenACC. The existing code uses a dynamically allocated 2D array of structs in C, and I need to copy these structs to the GPU for further calculation. A code sample is given below:
typedef struct {
    int *list;   // list of particles in a given bucket
    int count;   // total number of particles in the bucket
} structBucket;

typedef struct {
    structBucket **bucket;
    int numberOfBuckets[2];   // number of buckets in the x- and y-dimensions
} structDomain;

structDomain domain;
// Allocate memory for **bucket
domain.numberOfBuckets[XDIM] = 10;
domain.numberOfBuckets[YDIM] = 5;
int iX, iY, capacity;
domain.bucket = (structBucket**)malloc(sizeof(structBucket*) * domain.numberOfBuckets[XDIM]);
for (iX = 0; iX < domain.numberOfBuckets[XDIM]; iX++)
    domain.bucket[iX] = (structBucket*)malloc(sizeof(structBucket) * domain.numberOfBuckets[YDIM]);
// Calculate domain.bucket[iX][iY].count here using some logic
.
.
.
// Allocate memory for *list
for (iX = 0; iX < domain.numberOfBuckets[XDIM]; iX++)
{
    for (iY = 0; iY < domain.numberOfBuckets[YDIM]; iY++)
    {
        capacity = domain.bucket[iX][iY].count;
        if (capacity > 0)
        {
            domain.bucket[iX][iY].list = (int *)malloc(sizeof(int) * capacity);
        }
    }
}
After reviewing various sources on the internet, I have come up with the following solution (which might be utterly wrong):
// Memory for **bucket and *list needs to be created on the GPU.
#pragma acc enter data copyin(domain)
#pragma acc enter data copyin(domain.bucket)
#pragma acc enter data create(domain.bucket[0:domain.numberOfBuckets[XDIM]][0:domain.numberOfBuckets[YDIM]])
for (iX = 0; iX < domain.numberOfBuckets[XDIM]; iX++)
{
    for (iY = 0; iY < domain.numberOfBuckets[YDIM]; iY++)
    {
        #pragma acc enter data create(domain.bucket[iX][iY].list[0:domain.bucket[iX][iY].count])
    }
}
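For comparison, here is a minimal self-contained sketch of the same setup. It makes several assumptions that do not come from the code above: XDIM = 0 and YDIM = 1, a hypothetical placeholderCount() standing in for the real counting logic, and helper function names of my own choosing. The directive ordering (parent struct first, then the 2D bucket array, then each non-empty list, with deletes in reverse) follows the top-down pattern usually described for manual deep copy. It has not been verified on PGI 19.4, so treat it as a starting point rather than a confirmed fix. Without an OpenACC compiler the pragmas are ignored and everything runs on the host.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed index values; XDIM and YDIM are not defined in the snippet above. */
enum { XDIM = 0, YDIM = 1 };

typedef struct {
    int *list;   /* particle indices in this bucket */
    int count;   /* number of particles in this bucket */
} structBucket;

typedef struct {
    structBucket **bucket;
    int numberOfBuckets[2];   /* buckets in x and y */
} structDomain;

structDomain domain;

/* Hypothetical stand-in for the real counting logic. */
static int placeholderCount(int iX, int iY) { return (iX + iY) % 3; }

/* Host-side allocation. Returns 0 on success, -1 if malloc fails
   (partial allocations are not cleaned up, to keep the sketch short). */
int domainAllocHost(int nx, int ny)
{
    assert(nx > 0 && ny > 0);
    domain.numberOfBuckets[XDIM] = nx;
    domain.numberOfBuckets[YDIM] = ny;
    domain.bucket = malloc(sizeof *domain.bucket * nx);
    if (!domain.bucket) return -1;
    for (int iX = 0; iX < nx; iX++) {
        domain.bucket[iX] = malloc(sizeof **domain.bucket * ny);
        if (!domain.bucket[iX]) return -1;
        for (int iY = 0; iY < ny; iY++) {
            structBucket *b = &domain.bucket[iX][iY];
            b->count = placeholderCount(iX, iY);
            b->list = NULL;   /* empty buckets keep a NULL list */
            if (b->count > 0) {
                b->list = malloc(sizeof *b->list * b->count);
                if (!b->list) return -1;
            }
        }
    }
    return 0;
}

/* Top-down manual deep copy: parent first, then each level below it. */
void domainCopyToDevice(void)
{
    int nx = domain.numberOfBuckets[XDIM];
    int ny = domain.numberOfBuckets[YDIM];
    /* 1. Shallow copy of the parent struct. */
    #pragma acc enter data copyin(domain)
    /* 2. Bucket array, counts included; the runtime is expected to
          attach it to the device copy of domain.bucket. */
    #pragma acc enter data copyin(domain.bucket[0:nx][0:ny])
    /* 3. One list per non-empty bucket. */
    for (int iX = 0; iX < nx; iX++) {
        for (int iY = 0; iY < ny; iY++) {
            int n = domain.bucket[iX][iY].count;
            if (n > 0) {
                #pragma acc enter data copyin(domain.bucket[iX][iY].list[0:n])
            }
        }
    }
}

/* Release device memory bottom-up, mirroring the copy order. */
void domainDeleteFromDevice(void)
{
    int nx = domain.numberOfBuckets[XDIM];
    int ny = domain.numberOfBuckets[YDIM];
    for (int iX = 0; iX < nx; iX++) {
        for (int iY = 0; iY < ny; iY++) {
            int n = domain.bucket[iX][iY].count;
            if (n > 0) {
                #pragma acc exit data delete(domain.bucket[iX][iY].list[0:n])
            }
        }
    }
    #pragma acc exit data delete(domain.bucket[0:nx][0:ny])
    #pragma acc exit data delete(domain)
}

/* Free all host memory and reset the bucket pointer. */
void domainFreeHost(void)
{
    int nx = domain.numberOfBuckets[XDIM];
    int ny = domain.numberOfBuckets[YDIM];
    for (int iX = 0; iX < nx; iX++) {
        for (int iY = 0; iY < ny; iY++)
            free(domain.bucket[iX][iY].list);   /* free(NULL) is a no-op */
        free(domain.bucket[iX]);
    }
    free(domain.bucket);
    domain.bucket = NULL;
}
```

The intended call order is domainAllocHost(10, 5), domainCopyToDevice(), the compute regions, domainDeleteFromDevice(), and finally domainFreeHost(). Using copyin rather than create for the bucket array is deliberate in this sketch, so that the count values filled in on the host are also available on the device.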
I would appreciate advice on manually deep-copying **bucket and *list to GPU memory. Is my solution correct? Could someone suggest improvements, or a better way to do a manual deep copy of these structs?
I am using the PGI 19.4 compiler on Windows 10. Many thanks.