1

I have a fortran program that has the structure given below. This program has been crashing with 'Segmentation fault' error, so I did a lot of digging into the root cause.

It turns out, in the subroutine SUB1, when the DO loop counter variable I reaches a value of 187606, the value of Z(608673), which was originally read as 28.29 from the external file, somehow inexplicably changes to 3.9809702045652999E-309, even though there has been no calculation that would change the values of any of the X, Y or Z arrays after being read from the external file.

I suspect many other array element values are also being set to arbitrary numbers by the program midway through the execution (probably around zero) - though it is impossible to explicitly check for this, given the large size of the array - but in essence this is what is causing the program to crash (these values are used elsewhere in the program).

So the main question is: Why are the array element values being reset arbitrarily by the program, although these arrays are SAVE'd in the MODULE VARIABLES and this module is being USE'd by all the relevant routines? Isn't the SAVE statement meant to preserve the values of the arrays?

Note: I tried running this code on [a] Ubuntu Linux Dell laptop with 16 GB of RAM (no other memory hungry program running) and [b] a Windows laptop with 16 GB RAM. On both these machines I used the gfortran compiler and the same problem occurs. For small values of N, the code seems to run fine - only very large arrays are a problem.

File vars.f90

MODULE VARIABLES

   ......
   ......
   INTEGER :: N
   REAL(8), DIMENSION(:), ALLOCATABLE :: X,Y,Z

   SAVE N
   SAVE X,Y,Z
   .....
   .....
   ... (several other variable declarations) .... 

END MODULE VARIABLES

File allocate.f90

MODULE ALLOCATE_ARRAYS

CONTAINS

SUBROUTINE ALLOC_ARR
   USE VARIABLES
   USE DEALLOCATE_ARRAYS

   .....

   ALLOCATE(X(N),Y(N),Z(N), STAT=IALLOC)
   IF (IALLOC /= 0) THEN
      WRITE(6,*)' ERROR while allocating arrays X Y Z'
      CALL DEALLOC_ARRAYS
      STOP
   ENDIF

   .....
END SUBROUTINE ALLOC_ARR

END MODULE ALLOCATE_ARRAYS

File deallocate.f90

MODULE DEALLOCATE_ARRAYS

CONTAINS

SUBROUTINE DEALLOC_ARR
   USE VARIABLES
   .....

   IF (ALLOCATED(X)) DEALLOCATE(X)
   IF (ALLOCATED(Y)) DEALLOCATE(Y)
   IF (ALLOCATED(X)) DEALLOCATE(X)

   ...
END SUBROUTINE DEALLOC_ARR

END MODULE DEALLOCATE_ARRAYS

File main.f90

PROGRAM PROG1

   USE VARIABLES
   USE ALLOCATE_ARRAYS
   USE DEALLOCATE_ARRAYS

   .....

   N = 18666800      ![in practice this is read from an external input file]
   CALL ALLOC_ARR    !Allocate X,Y,Z each to be of length N

   !Read all values of X(1:N), Y(1:N) and Z(1:N) from an external file] ...
   OPEN(11,FILE=datafile.txt,STATUS='OLD')

   DO I = 1,N
      READ(11,*,IOSTAT=ISTAT) X,Y,Z
         IF ( ISTAT /= 0 ) THEN
            WRITE(6,*)'   *** SERIOUS WARNING ***'
            WRITE(6,*)'   Something went wrong while trying to read numbers X Y Z at I = ',I
            READ(5,*)
         ENDIF
   ENDDO

   CLOSE(11)

   CALL SUB1

   .....
   CALL DEALLOC_ARRAYS

END PROGRAM PROG1


SUBROUTINE SUB1

   USE VARIABLES
   USE ALLOCATE_ARRAYS
   USE DEALLOCATE_ARRAYS

   .....

   DO I = 1,N
      write(6,*)' X(608673)  Y(608673)  Z(608673) = ',X(608673),Y(608673),Z(608673)  ![this statement only inserted for tracing the bug]
      CALL SUB2(I,X(I),Y(I),Z(I))
   ENDDO

END SUBROUTINE SUB1


SUBROUTINE SUB2(IPASS,XPASS,YPASS,ZPASS)

   USE VARIABLES
   USE ALLOCATE_ARRAYS
   USE DEALLOCATE_ARRAYS

   [carry out some calculations that use the values of IPASS, XPASS, YPASS, ZPASS, but never change their values]

END SUBROUTINE SUB2

File datafile.txt (small sample):

.....
.....
10879.544935    1200.249974       28.290163
10914.193168     205.374638      236.847393
23872.837623    3634.498293    23721.923293
.....
.....
  • I often get segmentation fault with (old) Fortran programs and large dimensions, because there is always a stack-allocated array somewhere. Using `ulimit -s unlimited` to avoid limits on stack size helps in such cases (in Linux). Also, running your program with Valgrind may point you to something. – jacob Sep 11 '19 at 20:55
  • @jacob thanks, but that did not solve it, unfortunately. – Curious Leo Sep 11 '19 at 23:16

1 Answers1

0

If the big numbers overflow, then maybe something like the following:

USE ISO_C_BINDING
INTEGER(KIND=C_Int32_t) :: N

Or start by sprinkling some debug statements in so as to ensure that the numbers you are passing actually get received as expected.

If 32 bits is not enough then C_Int64_t can hold bigger numbers.

The REAL(8) can also be as REAL(KIND=C_DOUBLE) using the ISO_C_BINDING, and ISO_C_BINDING is not a bad approach to defining sizes.

Holmz
  • 714
  • 7
  • 14