Using an OpenACC directive inside an interoperability region with CUDA

Question

Is there any way to further parallelize the loop in the following region? Placing a #pragma acc loop directive on it has no effect, because PGI 18.1 ignores the directive.

    #pragma acc host_data use_device(ptr)
    {
        cufftPlanMany(&plan, rank, ss, &inembed, istride, idist, &onembed, ostride, odist, CUFFT_Z2Z, F.length[0]);
        // this loop
        for (int i = 0; i < length[2]; i++)
        {
            cufftExecZ2Z(plan, (cufftDoubleComplex *)(ptr + i*length[0]*length[1]), (cufftDoubleComplex *)(ptr + i*length[0]*length[1]), CUFFT_INVERSE);
        }
        cufftDestroy(plan);
    }

Does cufftPlanMany already take care of this?

score 0 · Answer 1 · answered Mar 07 '18 at 16:35

I think that, assuming cuFFT already uses the full computational power of the GPU, parallelizing this loop further might not even make sense.

answered Mar 07 '18 at 16:35

JimBamFeng
