3
program test

    implicit none
    real(dp),allocatable :: a(:), b(:)

    allocate (a(10))

    a = 1.d0
(*) b = a

end program

In the above code, I set two allocatables - a and b - and only allocated a in the code. I expected the code to fail to be compiled, but it is compiled well and it seems to work well while the code below, which does a similar job,shows SEGFAULT.

program test_fail

   implicit none
   real(dp),allocatable :: a(:), b(:)
   integer              :: i

   allocate ( a(10) )

   a = 1.d0

   do i = 1, 10
      b(i) = a(i)
   enddo

end program

Is it fine to understand that the former code allocates b automatically?

subroutine test_sub(a)

   implicit none

   real(dp),intent(inout) :: a(:,:)

   ...

end program

And also in above subroutine with shaped-array input, is it fine to understand that the code automatically detects the size of input array a, and allocates its own array in the subroutine and deallocates it returning to its upper frame?

And for the last, which is faster when I copy an array to another array?

program speed1

  implicit none
  real(dp), allocatable :: a(:,:,:), b(:,:,:)

  allocate( a(10000,10000,10000) )

  a = 1.d0

  b = a

end program
program speed2

  implicit none
  real(dp), allocatable :: a(:,:,:), b(:,:,:)
  integer               :: i, j, k

  allocate( a(10000,10000,10000), b(10000,10000,10000) )

  a = 1.d0

  do i = 1,10000
    do j = 1,10000
      do k = 1,10000
        b(i,j,k) = a(i,j,k)
      enddo
    enddo
  enddo

end program
Sangjun Lee
  • 406
  • 2
  • 12
  • 1
    As you have your loops the wrong way around b=a is probably going to be quicker (assuming you fix the allocation issue in speed2). If you order the loops the other way around I would hope they take the same time, but if the bottleneck in your code is zeroing an array it is a very strange code! – Ian Bush Jun 23 '20 at 12:18
  • 6
    Note that this is one time when it's important to note whether it's Fortran 90, or free-form (introduced in F90): the rules around that assignment of `b` [changed in Fortran 2003](https://stackoverflow.com/q/42140832/3157076), and behaviour may depend on compiler version and/or compile-time flags. – francescalus Jun 23 '20 at 12:18
  • @IanBush What should I fix in the ```speed2``` ? And I've tried a few combinations of ```i,j,k``` orders, but all of them was slower than ```b = a``` statement. If it were for 2D case, with the fortran's column major ordering, I expect the order of ```j,i``` would be faster. But with 3D case, what ordering of ```i,j,k``` is the optimal one for the fortran? In my trials, ```i,j,k``` were the fastest. – Sangjun Lee Jun 24 '20 at 06:02
  • 1
    An array is automatically allocated at array assignment, because the shape is then known. You can not index an unallocated array, which causes segfault. When it comes to loop order in fortran, the left-most index should be in inner-most loop, like `do k; do j; do i; arr(i,j,k)`. But just use `b=a` it's simple and clear, and probably equivalent to correctly ordered loops from the compilers point of view. – Jonatan Öström Jun 30 '20 at 00:01
  • @JonatanÖström Thank you for the clear explanation :) – Sangjun Lee Jul 01 '20 at 13:13

3 Answers3

4

I don't have much time, so this might be rather compressed. Note too that, as pointed out in a comment above (copied here for prominence and for posterity) the validity of the code, and of this answer, is version dependent (and see this question for more details).

As you have found, the statement

b = a

will automatically allocate the array b to the same size (and shape) as a and give its elements the same values as the elements of a. That's all in line with the standard. However, the statement

b(i) = a(i)

does not have an array on the left, it has an array element, and Fortran won't automatically allocate an array in this case. It's unfortunate, but this is (I believe) one of those circumstances where the compiler is not required, by the language standard, to spot the mistake - which is why you find out at run-time. Your compiler seems to behave as the other compilers I have recent experience of - there is, effectively, no element b(1) to assign the value of a(1) to, etc.

As for the 'allocation' in the subroutine with the statement

real(dp),intent(inout) :: a(:,:) 

that's not quite the same. Here a is assumed-shape and simply assumes the shape (and values) of the corresponding argument passed to the routine. The standard is silent on how this is done. Most compilers will not make a copy of the array (for performance reasons which are generally important to Fortran programmers) but some might, and it is easy to construct examples where most compilers will make copies of data structures passed as arguments.

As for which of your final two codelets is faster - why don't you tell us ?

High Performance Mark
  • 77,191
  • 7
  • 105
  • 161
3

The answer by High Performance Mark is correct and helpful, but there may be some advantage to restating some of the points in a slightly different way.

As previously mentioned in that answer, and in the linked question, assignment like

b = a

for b not allocated depends on whether Fortran 2003+ rules are used. However, if b is not allocated then assignment to an element of b, as in,

b(i) = a(i)

is always wrong (up to current language revisions). Whenever an array element appears on the left-hand side of an assignment (or in a variable reference), the value of the subscript must be within the bounds of the array - but an allocatable array which is not allocated doesn't have defined bounds.1

This contrasts with languages such as Matlab where arrays "grow" if there is an assignment beyond the current range of an array's size. Fortran (2003+) whole-array assignment is an exception where (re-)allocation occurs with the whole array taking on the right-hand side in completeness. On such whole-array assignment the required size is known; with element-by-element assignment the eventual allocation is initially unknown.

Regarding the dummy argument a of the subroutine test_sub there's a little to add to High Performance Mark's answer: argument association means that a in the subroutine may share certain aspects with an array in the main program which calls the subroutine. That's indeed likely to include any allocation/storage which has previously been given.

For completeness, though: Fortran is one of those languages where semantics are not defined line-by-line. The ... of the subroutine hide things other people may be interested in. Inside that ... we may change this interpretation of a.

subroutine  test_sub(a)
   real(dp),intent(inout) :: a(:,:)
   allocatable :: a  ! or "pointer :: a"
end program

Suddenly a is not assumed-shape but deferred-shape. I won't go in to the consequences here, but they can be found in other questions and answers.

Finally, on speed of assignment we come to the comparison of whole-array assignment and looping over all elements of arrays (with left-hand side correctly allocated). The rules of Fortran guarantee that with an array on the left-hand side the assignment is "performed element-by-element on corresponding array elements" but that the "processor may perform the element-by-element assignment in any order".

Your assignments will (in most cases) have the same effect, but a compiler may reorder or take advantage of vectorization/parallelization as it sees fit. Equally, however, it can do exactly the same analysis with the loop (including re-ordering the loops, as Ian Bush's comment suggests may be helpful, or in response to things such as OpenMP directives). If you want to know precise details of performance in your case you'll have to test, but if you micro-optimize with a hand-crafted loop you'll annoy someone eventually, just as surely as you'll annoy someone else with whole-array assignment.


1For the deeply interested party, see Fortran 2018 8.5.8.4 and 9.5.3.1.

francescalus
  • 30,576
  • 16
  • 61
  • 96
1

Let me extend my comment as an answer.

1st code snipped compiles because b takes the shape of a at assignment b=a, which is a Fortran-standard feature

2nd snippet is wrong: an unallocated array cannot be indexed, it has no shape or entries. Therefor segfault.

(In this simple case the compiler could easily detect this mistake at compile-time and give an error. But allocatable arrays can be passed between subroutines and allocated/deallocated there, making it hard to detect the mistake in the general case. Which is why, I guess, this check is not done.)

3rd snippet: An assumed-shape array a(:,:) in subroutine/function arguments, takes the shape and values of the array being passed to the routine. It will (usually) be a reference to the same memory.

4th and 5th snippet: Fortran arrays are column-major meaning that the left-most index should be inner-most loop etc for fastest execution. Like:

do k = 1, n
    do j = 1, n
        do i = 1, n
            b(i,j,k) = a(i,j,k)

However, in simple cases the compiler will fix this when optimization is applied (for example with gfortran -O3).

Since bis unused, it might be that the compiler removes the copy-operation in both cases.

Copying is rarely a bottle-neck, so I would use b=a regardless of performance, since it's clear and short. Anyway the loop above and a=b might just be equivalent to the compiler.

Jonatan Öström
  • 2,428
  • 1
  • 16
  • 27