I am offloading code to a GPU using OpenMP 4.5. So far everything works on the GPU, except when I try to create a parallel region with a private clause on an array that was allocated before the offload.
I am using GCC 7.2.0 and CUDA 9.2.88 on CentOS 7, and I compile with:
gfortran ./testCode.F90 -fopenmp -o ./test
Here is a sample program:
#define LENGTH_X 4
#define LENGTH_Y 4
#define PRINT
program main
  use omp_lib
  implicit none
  real, allocatable :: testVar(:,:)
  real :: error = 0
  logical :: onCPU
  integer :: i, j, k

  allocate(testVar(LENGTH_X,LENGTH_Y))
  do i = 1, LENGTH_X
    testVar(i,:) = i
#ifdef PRINT
    print *, testVar(i,:)
#endif
  end do

  onCPU = omp_is_initial_device()
  !$omp target map(tofrom:testVar, onCPU,error)
  !$OMP TEAMS DISTRIBUTE PARALLEL DO private(testVar) reduction(max:error)
  do i = 2, LENGTH_X-1
    do j = 2, LENGTH_Y-1
      testVar(i,j) = 0.25
    end do
  end do
  !$OMP END TEAMS DISTRIBUTE PARALLEL DO
  onCPU = omp_is_initial_device()
  !$omp end target

  print *, "Ran on CPU", onCPU
  print *, "New vars"
  do i = 1, LENGTH_X
#ifdef PRINT
    print *, testVar(i,:)
#endif
  end do
end program main
The build fails with the following error:
unresolved symbol _gfortran_os_error
collect2: error: ld returned 1 exit status
mkoffload: fatal error: x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status
compilation terminated.
lto-wrapper: fatal error: /opt/software/GCC/7.2.0-cuda-9.2.88-offload/libexec/gcc/x86_64-pc-linux-gnu/7.2.0//accel/nvptx-none/mkoffload returned 1 exit status
compilation terminated.
/opt/software/binutils/2.28-GCCcore-6.4.0/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
If I change private to shared, it works fine. I am new to Fortran but know how to program in C/C++ and Python. Any advice would be appreciated!
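For reference, here is a minimal sketch of the shared variant mentioned above. It is trimmed down from the sample: the reduction and the onCPU check are dropped, the explicit private(j) is added only for clarity, and the parameters nx and ny are stand-ins for the LENGTH_X and LENGTH_Y macros. Treat it as an illustration of the clause change rather than the exact code that was run.

```fortran
program shared_variant
  implicit none
  ! nx, ny are illustrative stand-ins for LENGTH_X / LENGTH_Y
  integer, parameter :: nx = 4, ny = 4
  real, allocatable :: testVar(:,:)
  integer :: i, j

  ! Allocate and fill on the host before offloading
  allocate(testVar(nx, ny))
  do i = 1, nx
    testVar(i,:) = i
  end do

  ! Map the whole array to the device and back; every thread
  ! works on the same mapped copy because it is shared
  !$omp target map(tofrom:testVar)
  !$omp teams distribute parallel do shared(testVar) private(j)
  do i = 2, nx-1
    do j = 2, ny-1
      testVar(i,j) = 0.25
    end do
  end do
  !$omp end teams distribute parallel do
  !$omp end target

  ! Print the array after the target region
  do i = 1, nx
    print *, testVar(i,:)
  end do

  deallocate(testVar)
end program shared_variant
```

This builds with the same command as above (gfortran with -fopenmp). The only difference that matters for the question is the data-sharing clause on testVar.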