Fortran performance when passing array slices as arguments

Question

I like fortran's array-slicing notation (array(1:n)), but I wonder whether I take a performance hit if I use them when it's not necessary.

Consider, for example, this simple quicksort code (It works, but obviously it's not taking care to pick a good pivot):

recursive subroutine quicksort(array, size)

    real, dimension(:), intent(inout) :: array
    integer, intent(in) :: size
    integer :: p

    if (size > 1) then
        p = partition(array, size, 1)
        call quicksort(array(1:p-1), p-1)
        call quicksort(array(p+1:size), size-p)
    end if

end subroutine quicksort

function partition(array, size, pivotdex) result(p)

    real, dimension(:), intent(inout) :: array
    integer, intent(in) :: size, pivotdex
    real :: pivot
    integer :: i, p

    pivot = array(pivotdex)
    call swap(array(pivotdex), array(size))

    p=1
    do i=1,size-1
        if (array(i) < pivot) then
            call swap(array(i), array(p))
            p=p+1
        end if
    end do

    call swap(array(p), array(size))

end function partition

subroutine swap(a, b)

    real, intent(inout) :: a, b
    real :: temp
    temp = a
    a = b
    b = temp

end subroutine swap

I could just as easily pass the whole array along with the indices of where the recursive parts should be working, but I like the code this way. When I call quicksort(array(1:p-1), p-1), however, does it make a temporary array to operate on, or does it just make a shallow reference structure or something like that? Is this a sufficiently efficient solution?

This question is related, but it seems like it makes temporary arrays because of the strided slice and explicit-sized dummy variables, so I'm safe from that, right?

One comment - why do you pass the size separately? You can also determine that using the `SIZE` intrinsic. — , Mar 17 '15 at 07:12
Not really necessary for the question, but newdex is not defined in your code sample — , Mar 17 '15 at 07:16
Oops, `newdex` should be `p`; I was trying to make the names somewhat logical and couldn't decide which was worse. And you're right, I guess I could have used `SIZE`, but I always forget. Also, the array this would have been called on is larger and only partially filled in the working code, but passing in a slice to the outermost function would solve that, and it really is unnecessary for the `partition` function. Thanks for pointing that out. — krs013, Mar 17 '15 at 07:20

score 4 · Answer 1 · answered Mar 17 '15 at 16:45

Regarding your question of efficiency: Yes, for most cases, using assumed-shape arrays and array slices is indeed a sufficiently efficient solution.

There is some overhead involved. Assumed-shape arrays require an array descriptor (sometimes also called "dope vector"). This array descriptor contains information about dimensions and strides, and setting it up requires some work.

The code in the called procedure with an assume-shape dummy argument has to take both unity stride (the usual case) and non-unity stride into account. Somebody, somewhere, might want to call your sorting routine with an actual argument of somearray(1:100:3) because he only wants to sort every third element of the array. Unusual, but legal. Code that cannot depend on unity stride may have some performance penalty.

Having said that, compilers, especially those using link-time optimization, are quite good nowadays in inlining and/or stripping away all the extra work, and also tend to clone procedures for special-casing unity strides.

So, as a rule, clarity (and assumed-shape arrays) should win. Just keep in mind that the old-fashioned way of passing array arguments may, in some circumstances, gain some extra efficiency.

Does gfortran (or others, if you have that information) take the `contiguous` attribute for assumed shape arguments into account when optimizing? I now try to place it everywhere where it does not cause a temporary. — Vladimir F Героям слава, Mar 17 '15 at 17:16
Yes, gfortran sets the stride to one if you give a dummy argument the `contiguous` attribute, at least since 4.8 (too lazy to check older versions). You can look under the hood, so to speak, by using the -fdump-tree-original flag and inspecting the *.003.original file generated by the compiler. — , Mar 18 '15 at 20:11

score 2 · Accepted Answer · answered Mar 16 '15 at 23:13

2

Your subarray

  array(1:p-1)

is contiguous, provided array is contiguous.

Also, you use an assumed shape array dummy argument

  real, dimension(:), intent(inout) :: array

There is no need for a temporary. Just the descriptor of an assumed shape array is passed. And as your subarray is contiguous, even an assumed size, or explicit size, or assumed size dummy argument with the contiguous attribute would be OK.

answered Mar 16 '15 at 23:13

Vladimir F Героям слава

57,977
4
76
119

Excellent, thanks! That is what I thought, but I wanted to be sure and couldn't find much on the internet... – krs013 Mar 17 '15 at 00:35

Fortran performance when passing array slices as arguments

2 Answers2

Linked