
It seems that the runtime behaviour of fftw_mpi_plan_dft_r2c_3d is strongly affected by the first three arguments it takes. The following code is almost copied from the FFTW documentation. With L set to 512 and 48 processes, it gives a segmentation fault, but simply changing L to 1024 makes everything work fine. I get the same result on my Linux server with fftw-3.3.3 and on my Mac with fftw-3.3.4.

#include <fftw3-mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    const ptrdiff_t L = 512, M = 512, N = 512;
    fftw_plan plan;
    double *rin;
    fftw_complex *cout;
    ptrdiff_t alloc_local, local_n0, local_0_start, i, j, k;
    MPI_Init(&argc, &argv);
    fftw_mpi_init();

    /* get local data size and allocate */
    alloc_local = fftw_mpi_local_size_3d(L, M, N/2 + 1, MPI_COMM_WORLD, &local_n0, &local_0_start);
    rin = fftw_alloc_real(2 * alloc_local);
    cout = fftw_alloc_complex(alloc_local);
    printf("allocated\n");
    /* create plan for out-of-place r2c DFT */
    plan = fftw_mpi_plan_dft_r2c_3d(L, M, N, rin, cout, MPI_COMM_WORLD, FFTW_MEASURE);
    printf("plan made\n");
    fftw_destroy_plan(plan);
    fftw_free(rin);
    fftw_free(cout);
    MPI_Finalize();
    return 0;
}

This is the error on my Mac:

[MBP:01251] *** Process received signal ***
[MBP:01251] Signal: Segmentation fault: 11 (11)
[MBP:01251] Signal code:  (0)
[MBP:01251] Failing at address: 0x0
[MBP:01251] [ 0] 0   libsystem_platform.dylib            0x00007fff8fcc8f1a _sigtramp + 26
[MBP:01251] [ 1] 0   mca_pml_ob1.so                      0x0000000102eee422 append_frag_to_list + 594
[MBP:01251] *** End of error message ***
  • On which line of the code is the crash occurring? – Paul R Apr 14 '15 at 08:54
  • @PaulR fftw_mpi_plan_dft_r2c_3d – user4286578 Apr 14 '15 at 08:56
  • Add some error checking to make sure that the rin, cout allocations are successful (i.e. not NULL)? – Paul R Apr 14 '15 at 08:59
  • @PaulR I did actually. Even with L=1024 it worked out fine, so there couldn't be a problem with fftw_alloc. – user4286578 Apr 14 '15 at 09:02
  • It looks like an error in Open MPI. Please report it to the Open MPI developers [here](http://www.open-mpi.org/community/lists/ompi.php) (post to the **User list**). Or it could be FFTW incorrectly computing some kind of Scatterv distribution, resulting in overlapping segments. In any case, ask the Open MPI guys. – Hristo Iliev Apr 14 '15 at 09:08
  • This problem can be reproduced (I used L=80;M=256;N=256; and 9 processes). The program may fail or run correctly. Multiplying alloc_local by 4 does not solve the issue. It fails at `fftw_mpi_plan_dft_r2c()`, even if the flag `FFTW_ESTIMATE` is used. As this code is really close to the one [provided by fftw](http://www.fftw.org/doc/Multi_002ddimensional-MPI-DFTs-of-Real-Data.html#Multi_002ddimensional-MPI-DFTs-of-Real-Data), there might be a bug in fftw, as @HristoIliev wrote. – francis Apr 14 '15 at 09:24
  • Had the same problem and found it reported as a bug [here](https://github.com/FFTW/fftw3/issues/43). A "fix" was noted in a comment in that thread: *"to be safe if you restrict the number of processors to the case when the first dimension of the grid is divisible evenly by the number of processors you will be fine."* I checked this and it works fine for me (a sketch applying this restriction is below). – Winther Feb 08 '17 at 13:47
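
A minimal sketch of how the workaround from the last comment could be applied, combined with the allocation check suggested earlier, is below. The divisibility guard and its error message are assumptions about how to act on those comments, not part of FFTW's documented API or of the original code.

#include <fftw3-mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    const ptrdiff_t L = 512, M = 512, N = 512;
    ptrdiff_t alloc_local, local_n0, local_0_start;
    double *rin;
    fftw_complex *cout;
    fftw_plan plan;
    int nprocs, rank;

    MPI_Init(&argc, &argv);
    fftw_mpi_init();
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Workaround from the linked FFTW issue: only proceed when the first
       (slab-decomposed) dimension divides evenly among the processes. */
    if (L % nprocs != 0) {
        if (rank == 0)
            fprintf(stderr, "L = %td is not divisible by %d processes, aborting\n",
                    L, nprocs);
        MPI_Finalize();
        return EXIT_FAILURE;
    }

    alloc_local = fftw_mpi_local_size_3d(L, M, N/2 + 1, MPI_COMM_WORLD,
                                         &local_n0, &local_0_start);
    rin = fftw_alloc_real(2 * alloc_local);
    cout = fftw_alloc_complex(alloc_local);
    /* Allocation check suggested in the comments above. */
    if (rin == NULL || cout == NULL) {
        fprintf(stderr, "rank %d: allocation failed\n", rank);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    plan = fftw_mpi_plan_dft_r2c_3d(L, M, N, rin, cout,
                                    MPI_COMM_WORLD, FFTW_MEASURE);
    fftw_destroy_plan(plan);
    fftw_free(rin);
    fftw_free(cout);
    MPI_Finalize();
    return 0;
}

Under that restriction, with 48 processes the first dimension would have to be a multiple of 48 (e.g. 480, 528, or 576) rather than 512.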

0 Answers