I'm trying to use allocatable arrays inside "device" data structures that reside in GPU memory. Code (pasted below) compiles, but gives a segfault. Am I doing something obviously wrong?

Module file is called 'gpu_modules.F90', given below:

!=============
! This module contains definitions for data structures and the data
! stored on the device
!=============

   module GPU_variables
   use cudafor

   type :: data_str_def

!=============
! single number quantities
!=============

      integer                       :: i, j 
      real(kind=8)                  :: a 

!=============
! Arrays
!=============

      real(kind=8),   allocatable   :: b(:)
      real(kind=8),   allocatable   :: c(:,:)
      real(kind=8),   allocatable   :: d(:,:,:)
      real(kind=8),   allocatable   :: e(:,:,:,:)

   end type data_str_def

!=============
! Actual data is here
!=============

   type(data_str_def), device, allocatable   :: data_str(:)

   contains

!=============
! subroutine to allocate memory
!=============

      subroutine allocate_mem(n1)
      implicit none 
      integer, intent(in)  :: n1 

      call deallocate_mem()

      write(*,*) 'works here'
      allocate(data_str(n1))

      write(*,*) 'what about allocating memory?'
      allocate(data_str(n1) % b(10))
      write(*,*) 'success!'

      return
      end subroutine allocate_mem

!=============
! subroutine to deallocate memory
!=============

      subroutine deallocate_mem()
      implicit none
      if(allocated(data_str)) deallocate(data_str)
      return 
      end subroutine deallocate_mem

   end module GPU_variables

Main program is 'gpu_test.F90', given below:

!=============
! main program 
!=============

    program gpu_test
    use gpu_variables
    implicit none

!=============
! local variables
!=============

    integer             :: i, j, n

!=============
! allocate data
!=============

    n       = 2                 ! number of data structures

    call allocate_mem(n)

!=============
! deallocate device data structures and exit
!=============

    call deallocate_mem()
    end program

Compilation command (from current folder) is:

pgfortran -Mcuda=cc5x *.F90 

Terminal output:

$ ./a.out 
 works here
 what about allocating memory?
Segmentation fault (core dumped)

Any help, insight, or solution would be appreciated. And no, using pointers is not a viable option.

Edit: another detail that may be relevant: I'm using pgfortran version 16.10

ansri
  • Note that using `kind=8` is ugly and not portable (although it does not cause this error). Also, `return` before every `end` is completely superfluous. – Vladimir F Героям слава Jul 21 '17 at 09:00
  • Also note that `allocate(data_str(n1) % b(10))` allocates the `b` component only for the `n1`th element of `data_str`. But that may be your intention in this simple example. – Vladimir F Героям слава Jul 21 '17 at 09:02
  • Possible duplicate https://stackoverflow.com/questions/44680150/how-to-allocate-arrays-of-arrays-in-structure-with-cuda-fortran – Vladimir F Героям слава Jul 21 '17 at 09:04
  • Hi Vladimir, thanks for the replies. `kind=8` was only to make things explicitly clear. Yes, I wanted to allocate only one component of the derived type, without touching the rest. I looked first at the "pointer in data structure" question too; the suggested solution was to make a host-side copy of the data structure and copy the entire thing over to the device. I'll try it out and post a reply. – ansri Jul 21 '17 at 21:36
  • Just tried it. No luck making a host copy and transferring it to the device. – ansri Jul 21 '17 at 21:42
  • So, according to the documentation (PGI OpenACC guide, v2015 and v2017): "Arrays of derived type, where the derived type contains allocatable members, have not been tested and should not be considered supported for this release. That important feature will be included in an upcoming release." I'm not sure whether this refers to the OpenACC feature itself or to the compiler in general. In any case, it seems to be an unstable feature. I guess I'll have to create it manually using pointers. – ansri Jul 21 '17 at 21:51
  • I think you misunderstand the quote. OpenACC plays no role here. – Vladimir F Героям слава Jul 22 '17 at 08:04
  • You should show us what you tried and which error you got. Please read the complete discussion in https://stackoverflow.com/questions/44680150/how-to-allocate-arrays-of-arrays-in-structure-with-cuda-fortran again. I posted some important comments there myself, below talonmies's answer. Please show your new code. – Vladimir F Героям слава Jul 22 '17 at 08:13
  • I am also sure you can make your code sample much shorter. A few lines should suffice here. – Vladimir F Героям слава Jul 22 '17 at 08:15

2 Answers

The reason for the segmentation fault is that allocating data_str(n1)%b requires the host to access the memory that holds data_str. Since data_str is in device memory, not host memory, that access fails with a segmentation fault. In theory, the compiler could create a host temp, allocate it, and then copy it to the descriptor for data_str(n1)%b, but that's not part of today's CUDA Fortran.

You can work around this case by creating the temp yourself:

      subroutine allocate_mem(n1)
      implicit none
      integer, intent(in)  :: n1
      type(data_str_def) :: data_str_h

      call deallocate_mem()

      write(*,*) 'works here'
      allocate(data_str(n1))

      write(*,*) 'what about allocating memory?'
      allocate(data_str_h% b(10))
      data_str(n1) = data_str_h
      write(*,*) 'success!'

      return
      end subroutine allocate_mem

BTW, is your intention that components b, c, d, and e are allocated in host memory or device memory? I don't see the device attribute on them, so in the above, they'd go to host memory.
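If the intention is for the component data itself to live in device memory, one alternative is to keep the array of structures on the host and put the device attribute on the allocatable component instead. The following is only a minimal sketch under that assumption, not the layout from the question: the type name data_str_dev, the kind parameter dp, and the host buffer b_host are hypothetical names introduced for illustration.

```fortran
program device_component_sketch
   use cudafor
   implicit none

   integer, parameter :: dp = kind(1.0d0)   ! assumed stand-in for kind=8

   ! Hypothetical variant of data_str_def: the structure stays on the host,
   ! only the allocatable component carries the device attribute.
   type :: data_str_dev
      integer  :: i, j
      real(dp) :: a
      real(dp), allocatable, device :: b(:)
   end type data_str_dev

   type(data_str_dev), allocatable :: data_str(:)   ! host array of structures
   real(dp) :: b_host(10)                           ! host buffer for checking
   integer  :: n

   n = 2
   allocate(data_str(n))          ! structures (and descriptors) on the host
   allocate(data_str(n)%b(10))    ! component data allocated on the device

   data_str(n)%b = 1.0_dp         ! fill the device array from the host
   b_host = data_str(n)%b         ! device-to-host copy
   write(*,*) 'first element copied back: ', b_host(1)

   deallocate(data_str(n)%b)
   deallocate(data_str)
end program device_component_sketch
```

With this layout the host can read the descriptor of data_str(n)%b, so the allocate statement no longer touches device memory from host code. The trade-off is that the structure as a whole cannot be passed to a kernel; individual components such as data_str(n)%b would be passed instead.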

Rafik Zurob
  • Hi Rafik, thanks for the suggestion. The data structure in the module, "data_str", has the device attribute, so it resides in GPU memory. Any entries in "data_str" also inherit this attribute. – ansri Jul 30 '17 at 00:07
  • @ansri *"Any entries in "data_str" also inherit this attribute."* That is not true! See the link I have already shown you https://stackoverflow.com/questions/44680150/how-to-allocate-arrays-of-arrays-in-structure-with-cuda-fortran – Vladimir F Героям слава Jul 31 '17 at 20:40

I posted this question on the PGI forums, and someone from PGI confirmed that the feature is not supported in the way I'm trying to use it.

http://www.pgroup.com/userforum/viewtopic.php?t=5661

His recommendation was to use the "managed" attribute or use fixed-sized arrays inside the data structure.
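For reference, the fixed-size variant might look roughly like the sketch below. It assumes the array extent is known at compile time. The names data_str_fixed, nb, dp, and data_str_h are hypothetical and introduced only for illustration, and the "managed" alternative is not shown.

```fortran
module gpu_variables_fixed
   use cudafor
   implicit none

   integer, parameter :: dp = kind(1.0d0)   ! assumed stand-in for kind=8
   integer, parameter :: nb = 10            ! assumed compile-time size of b

   ! Hypothetical fixed-size version of data_str_def: no allocatable
   ! components, so the type contains no descriptors.
   type :: data_str_fixed
      integer  :: i, j
      real(dp) :: a
      real(dp) :: b(nb)
   end type data_str_fixed

   type(data_str_fixed), device, allocatable :: data_str(:)

end module gpu_variables_fixed

program gpu_test_fixed
   use gpu_variables_fixed
   implicit none

   type(data_str_fixed), allocatable :: data_str_h(:)   ! host-side copy
   integer :: n, k

   n = 2

   ! Fill a host copy first, since host code cannot write into the
   ! components of the device-resident elements directly.
   allocate(data_str_h(n))
   do k = 1, n
      data_str_h(k)%i = k
      data_str_h(k)%j = 0
      data_str_h(k)%a = 0.0_dp
      data_str_h(k)%b = 0.0_dp
   end do

   allocate(data_str(n))      ! device allocation of the whole array
   data_str = data_str_h      ! host-to-device copy of all elements

   deallocate(data_str)
   deallocate(data_str_h)
end program gpu_test_fixed
```

Because each element is a plain block of memory, the whole array can be copied between host and device with a single assignment. The cost is that every element reserves space for b whether or not it is used.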

ansri