Consider the following Fortran program:

program test_prg
  use iso_fortran_env, only : real64
  use mpi_f08

  implicit none
  real(real64), allocatable :: arr_send(:), arr_recv(:)
  integer :: ierr

  call MPI_Init(ierr)
  allocate(arr_send(3), arr_recv(3))
  arr_send = 1
  print *, lbound(arr_recv)
  call MPI_Gatherv(arr_send, size(arr_send), MPI_DOUBLE_PRECISION, &
                   arr_recv, [size(arr_send)], [0], MPI_DOUBLE_PRECISION, &
                   0, MPI_COMM_WORLD, ierr)
  print *, lbound(arr_recv)
  call MPI_Finalize(ierr)
end program

Running this program on one processor (compiled with gfortran 9.3.0 and MPICH 3.3.2) prints:

       1
       0

So arr_recv has a different lower bound after the call to MPI_Gatherv. If I pass arr_recv(1) instead of arr_recv in the call to MPI_Gatherv, the lower bound doesn't change. If I replace the mpi_f08 module with mpi, then neither arr_recv(1) nor arr_recv changes the lower bound.

Why is the lower bound changing in this program?

QNA
  • Please try the most recent version of gfortran. This looks like an array descriptor or a binding headers issue. – Vladimir F Героям слава Sep 10 '20 at 06:51
  • I am not even sure this is a valid MPI program since the receive buffer is being overwritten by buffers from all ranks. – Gilles Gouaillardet Sep 10 '20 at 07:17
  • @GillesGouaillardet he's getting this on one process, where that won't be an issue – Ian Bush Sep 10 '20 at 07:20
  • @IanBush good catch! – Gilles Gouaillardet Sep 10 '20 at 07:21
  • @VladimirF The same thing happens with gfortran 10.0.1. I'm compiling version 10.2 now. – QNA Sep 10 '20 at 17:57
  • FWIW, it works with Open MPI and gcc 10.2.0. – Gilles Gouaillardet Sep 10 '20 at 23:05
  • Which MPICH version are you running? – Gilles Gouaillardet Sep 10 '20 at 23:05
  • @GillesGouaillardet 3.3.2 – QNA Sep 10 '20 at 23:49
  • @VladimirF Same thing on 10.2 – QNA Sep 11 '20 at 02:47
  • What is your `configure` command line? fwiw, i get a `MPI_Type_create_hindexed()` error because of a negative count (!) with gcc 10.2 – Gilles Gouaillardet Sep 11 '20 at 11:54
  • @GillesGouaillardet Do you mean `configure` for the gcc 10.2 build? `configure --prefix=/home/username/gcc-10.2 --disable-multilib --with-system-zlib --enable-languages=c,c++,fortran` – QNA Sep 11 '20 at 15:24
  • Nope, configure for MPICH – Gilles Gouaillardet Sep 12 '20 at 00:18
  • @GillesGouaillardet I didn't compile MPICH from sources, I just `apt install`'ed it. – QNA Sep 12 '20 at 01:51
  • which distro are you running? note mpich must have been built with a `gfortran` version that is interoperable with the `gfortran` you use to build your MPI app, at least if you `use mpi` or `use mpi_f08` since there is generally no interoperability of Fortran modules (`.mod`) between compilers with different version. – Gilles Gouaillardet Sep 12 '20 at 06:53
  • @GillesGouaillardet I don't know how to check it. One of the default ones on Ubuntu 20. I'll try to rebuild MPICH with gfortran 10.2 then. – QNA Sep 12 '20 at 17:03
  • @GillesGouaillardet I wasn't able to compile MPICH from sources. Their build script does not work on my machine: first, it cannot find libbacktrace, then, when I specifically point out its location, it starts complaining that the functions exported by that lib have already been defined in another library. Not sure what to do about it. – QNA Sep 12 '20 at 21:40
  • @GillesGouaillardet I was able to compile it with gfortran 10.0.1, but the result is the same – QNA Sep 12 '20 at 22:10
  • @GillesGouaillardet In your answer you mentioned an "assumed rank" (i.e. `dimension(..)`), but in the question I don't see an assumed rank or does `MPI_Gatherv` of MPICH have an assumed rank? – albert Sep 13 '20 at 09:00
  • the Fortran 2008 binding (in MPICH), aka `use mpi_f08`, for `MPI_Gatherv()` uses assumed rank (i.e. `dimension(..)`) – Gilles Gouaillardet Sep 13 '20 at 09:14
  • @GillesGouaillardet I think the difference between the `use mpi_f08` and `use mpi` is worth more than just a comment as it is (in this case due to a possible compiler bug for gfortran) the base of the problem. – albert Sep 13 '20 at 09:32
  • though `use mpi_f08` vs `use mpi` was stated in the question, I updated my answer to make the scope of the compiler bug crystal clear. – Gilles Gouaillardet Sep 13 '20 at 09:41

2 Answers

At this stage, I believe this is a bug in gfortran affecting the MPI Fortran 2008 bindings (i.e. use mpi_f08), and I reported it at https://gcc.gnu.org/pipermail/fortran/2020-September/055068.html. All gfortran versions are affected: I tried 9.2.0, 10.2.0 and the latest master branch; versions 8 and earlier do not support dimension(..).

The reproducer below demonstrates the issue:

MODULE FOO
INTERFACE
SUBROUTINE dummyc(x0) BIND(C, name="sync")
type(*), dimension(..) :: x0
END SUBROUTINE
END INTERFACE
contains
SUBROUTINE dummy(x0)
type(*), dimension(..) :: x0
call dummyc(x0)
END SUBROUTINE
END MODULE

PROGRAM main
    USE FOO
    IMPLICIT NONE
    integer :: before(2), after(2)

    INTEGER, parameter :: n = 1

    DOUBLE PRECISION, ALLOCATABLE :: buf(:)
    DOUBLE PRECISION :: buf2(n)

    ALLOCATE(buf(n))
    before(1) = LBOUND(buf,1)
    before(2) = UBOUND(buf,1)
    CALL dummy (buf)
    after(1) = LBOUND(buf,1)
    after(2) = UBOUND(buf,1)

    if (before(1) .NE. after(1)) stop 1
    if (before(2) .NE. after(2)) stop 2

    before(1) = LBOUND(buf2,1)
    before(2) = UBOUND(buf2,1)
    CALL dummy (buf2)
    after(1) = LBOUND(buf2,1)
    after(2) = UBOUND(buf2,1)

    if (before(1) .NE. after(1)) stop 3
    if (before(2) .NE. after(2)) stop 4

END PROGRAM

FWIW, the Intel ifort compiler (I tried 18.0.5) works fine with the reproducer.
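
For readers unfamiliar with dimension(..), here is a minimal sketch of an assumed-rank dummy argument, independent of MPI. The names rank_demo and show_rank are made up for illustration and do not come from MPICH or from the bug report. It only shows that one procedure can accept actual arguments of any rank, which is why the mpi_f08 interfaces use this feature for choice buffers.

```fortran
module rank_demo
  implicit none
contains
  ! Hypothetical helper: x is assumed-rank, so a scalar or an array
  ! of any rank can be passed as the actual argument.
  subroutine show_rank(x)
    real, dimension(..), intent(in) :: x
    print *, 'rank =', rank(x)
  end subroutine show_rank
end module rank_demo

program main
  use rank_demo
  implicit none
  real :: s, v(3), m(2,2)

  s = 0.0
  v = 0.0
  m = 0.0

  call show_rank(s)   ! scalar actual argument
  call show_rank(v)   ! rank-1 array
  call show_rank(m)   ! rank-2 array
end program main
```

This should build with a compiler that supports assumed-rank dummies and the rank intrinsic, for example with `gfortran rank_demo.f90`.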

Gilles Gouaillardet
  • Nice answer. When I had a look I thought what is `dimension(..)`, so I read up in the newest 2019 Fortran standard I have (draft of 28th December 2017 11:02) and I saw that this is a new feature of Fortran 2019 called "Assumed-rank entity" (paragraph 8.5.8.7). – albert Sep 13 '20 at 08:53
  • Gilles Can you mention in your answer the version of the Intel compiler you used and the gfortran version (as it is a new feature in Fortran I think it is good to know the versions of the compilers as well). – albert Sep 13 '20 at 08:55
  • `dimension(..)` and (improved) `C` interop were introduced in the `Fortran 2018` standard. I edited my answer with the compiler versions I tested. TL;DR all `gfortran` versions are affected, I guess all `ifort` versions are fine. – Gilles Gouaillardet Sep 13 '20 at 09:20
  • @GillesGouaillardet could you provide implementation details of `sync` in bugzilla? – QNA Sep 14 '20 at 17:11
  • updated. Crediting the author of a minimal reproducer is always appreciated – Gilles Gouaillardet Sep 15 '20 at 00:00

This question is a bit old, but I ran into the same issue in the last few days with GNU Fortran (GCC) 11.2.0 when moving from use mpi to use mpi_f08; in my case, it was with MPI_Allreduce. It's clearly a bug in the MPI functions, but one workaround is to pass the argument with its bounds given explicitly, as var(1:end). This worked for me. In your case, you could try:

call MPI_Gatherv(arr_send(1:3), size(arr_send(1:3)), MPI_DOUBLE_PRECISION, &
                 arr_recv(1:3), [size(arr_send(1:3))], [0], MPI_DOUBLE_PRECISION, &
                 0, MPI_COMM_WORLD, ierr)
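
The same idea can be tried without MPI. The sketch below is only an illustration under assumptions: touch and its binding name touch_demo are hypothetical stand-ins for an mpi_f08 routine, that is, a bind(C) procedure with an assumed-type, assumed-rank dummy. The program passes an explicit section of the allocatable array and then checks that the bounds of the array are unchanged. Whether passing the bare allocatable instead triggers the bug will depend on the gfortran version.

```fortran
module bounds_demo
  implicit none
contains
  ! Hypothetical stand-in for an mpi_f08 routine: a bind(C) procedure
  ! with an assumed-type, assumed-rank dummy. It does nothing with x.
  subroutine touch(x) bind(C, name="touch_demo")
    type(*), dimension(..) :: x
  end subroutine touch
end module bounds_demo

program workaround_demo
  use bounds_demo
  implicit none
  double precision, allocatable :: buf(:)
  integer :: lb, ub

  allocate(buf(3))
  buf = 1.0d0

  ! Remember the bounds before the call.
  lb = lbound(buf, 1)
  ub = ubound(buf, 1)

  ! Workaround: pass an explicit section, not the allocatable itself.
  call touch(buf(lb:ub))

  ! Stop with a non-zero code if either bound changed.
  if (lbound(buf, 1) /= lb) stop 1
  if (ubound(buf, 1) /= ub) stop 2
  print *, 'bounds preserved:', lbound(buf, 1), ubound(buf, 1)
end program workaround_demo
```

Saving the bounds and comparing them after the call, as done here, is also a cheap way to check whether a given compiler and MPI combination is affected.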
Filipe
  • According to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97046, this is supposed to be fixed in gcc 12 – QNA Apr 02 '22 at 04:32
  • That is good to know, @DartLenin, thanks! But until then (and even after), the workaround I’m using should do the trick (may be a bit of work for long codes). – Filipe Apr 03 '22 at 08:17