4

Are there tools that can be used to check which arguments of a fortran procedure is being defined or not inside the procedure? I mean something like a lexical analyzer that simply check if a variable is being used on the left hand side of an assignment (or equivalent) statement. Something similar to the check that the compiler does when the intent(in) are specified for an argument.

I came across a code written mostly in fortran 77 standard (no intent specified), with subroutines that have hundreds of arguments, some of those subroutines extend over 5000 of lines of code each. I would like to modify some part of the code and rewrite the long subroutines. I think that if I am able to trace the arguments that are being changed or not, it will be easy.

Any suggestion is welcome.

Just to define the limits of my question and avoid useless discussion: I am aware that variables can be modified by calls to other subroutines. I can handle that one manually if there is a tool that can check the direct modifications inside a given procedure.

innoSPG
  • 4,588
  • 1
  • 29
  • 42
  • 2
    I'm not aware of such a tool, but can you just plonk down `intent(in)` on each and see where the compiler complains? Yes, it isn't foolproof or automatable, but if you just need a quick hack... – francescalus Aug 11 '16 at 21:13
  • 1
    I fear that many compilers might not catch it if they are in different source files. – Vladimir F Героям слава Aug 11 '16 at 21:15
  • The number of variable definition contexts in F77 is tiny compared with modern code. So, do you care about all definition possibilities in all code, or just assignment LHS or just those F77? – francescalus Aug 11 '16 at 21:16
  • I retagged the question. lexical-analysis is what a lexer does, creating stream of tokens from characters. – Vladimir F Героям слава Aug 11 '16 at 21:17
  • @VladimirF, I am exactly in the case of your comment about different source files. I have many source files. – innoSPG Aug 12 '16 at 03:18
  • @francescalus, If there is a solution that catches the LHS assignment for a starting point, I will be happy with that. Most of the changes seem to be in LHS. – innoSPG Aug 12 '16 at 03:24
  • "Modification" and "definition" are really execution concepts - consider statements that may be conditionally executed. When parsing source (or performing semantic analysis of source) the relevant concept might be a "variable definition context" referred to by francescalus (which does not include all contexts that might result in definition of a variable at execution time). I am aware of tools that may be able to report objects that appear in variable definition contexts. – IanH Aug 12 '16 at 07:30
  • @IanH, please give me the name of those tools that are able to report objects appearing in variable definition context. That is the sort of tools I am after. – innoSPG Aug 12 '16 at 11:53
  • 1
    **IF** your source is **conforming** F2008 (which includes most of F77) and **IF** you have access to a Fortran 2003 compiler (which does not include current gfortran) then [here is the source for a little parsing utility](http://www.megms.com.au/download/VariableDefinitionContext.zip) I put together that does this (approximately - to do it properly requires full semantic analysis of the code). Other Fortran parsing toolsets should be able to be modified to do something similar. Further discussion about my example best done elsewhere - c.l.f or email. – IanH Aug 12 '16 at 14:52
  • What about passing constant as actual arguments? Hopefully, the tool will crash when the subroutine tries to modify them :) Not the best solution I guess, but could you try? – Alexander Vogt Aug 12 '16 at 15:47
  • @AlexanderVogt, that is the initial approach I was using. It is horrible with subroutines having hundreds of arguments. I will check the tool from IanH. – innoSPG Aug 12 '16 at 16:12
  • @IanH, where can get your email address? I got your code, I will give it a try and come back to you. Also, what is c.I.f? – innoSPG Aug 12 '16 at 16:14
  • Try ff08@megms.com.au c.l.f. is shorthand for comp.lang.fortran – IanH Aug 12 '16 at 21:20
  • INTENT(IN) is you friend here. PURE could also be potentially helpful. I believe you may have some work ahead of you... – Holmz Aug 13 '16 at 01:18
  • @Holmz, INTENT(IN) has definitely been helpful; I have been using it along with using constants as actual arguments. PURE however is out of hope with the bunch of global variables in here. – innoSPG Aug 13 '16 at 02:59
  • I compiled Ian's VDC package (it is very large!!) and it worked for me with ifort-16 (I finally got access to it recently...). I also tried to follow the mechanism but still not in success... It seems that it is necessary to learn user-defined type I/O, first of all. – roygvib Aug 14 '16 at 17:21

1 Answers1

4

For convenience, here is a script for compiling Ian's VariableDefinitionContext package. It was compiled successfully with ifort-16.0 with -standard-semantics (gfortran-6.1 and ifort-14 were not able to compile it with insufficient syntax support...)

#!/usr/bin/env python
from __future__ import print_function
import os

with open( "compile-order.txt", "r" ) as f:
    tmp = f.read()
allsrc = tmp.split()

#cmd = "ifort -standard-semantics -warn -check all"
cmd = "ifort -standard-semantics"

obj = ""
for src in allsrc:
    print( "compiling", src )
    os.system( ( cmd + " -c %s" ) % ( src ) )
    obj += src[ :-4 ]+ ".o "
os.system( ( cmd + " %s" ) % ( obj ) )
# Usage: ./a.out test.f90

...but it turned out that the following command can do the same job!! (thanks to @IanH)

$ ifort -standard-semantics @compile-order.txt

FWIW, here is another Python script for printing (potentially) modified variables. This script searches for various symbols in gfortran dump files. In contrast to Ian's package, only a minimal set of Fortran syntax is considered (direct assignment plus basic read/write statements etc).

One potential usage of this kind of script is to find potentially modified COMMON variables. Previously, I had a hard experience for modifying legacy Fortran programs with tons of COMMON blocks, so it might be useful for such situations...

#!/usr/bin/env python

from __future__ import print_function
import os, sys

def pushuniq( coll, item ):
    if not ( item in coll ): coll.append( item )

def getvarname( s, proc ):
    try:
        return s.split( proc + ":" )[ 1 ].split("(")[ 0 ].split("%")[ 0 ]
    except:  # ad-hoc!
        return s.split("(")[ 0 ].split("%")[ 0 ]

#------------------------------------------------------------------------
def varcheck( filename, Qwritedump=False ):
    """
    checks and prints potentially modified variables.
    Usage: varcheck.py <filenames>
    Set Qwritedump=True to write dump files.
    """
    #.........................................................
    # Generate gfortran dump file

    cmd = "gfortran -fdump-parse-tree -c %s"          # gfort >=4.7
    # cmd = "gfortran -fdump-fortran-original -c %s"  # gfort >=5

    with os.popen( cmd % ( filename ) ) as p:
        lines = p.readlines()

    base = '.'.join( filename.split('.')[:-1] )
    os.system( "rm -f %s.{o,mod}" % ( base ) )   # remove .o and .mod

    if Qwritedump:
        with open( "%s.dump" % ( filename ), "w" ) as f:
            f.write( ''.join( lines ) )
    #/

    #.........................................................
    # List of variables

    varlist = {}    # (potentially) modified variables
    arglist = {}    # dummy arguments
    comlist = {}    # common variables
    modlist = {}    # module variables
    reslist = {}    # result variables
    sublist = {}    # child subroutines
    namlist = {}    # namelists

    #.........................................................
    # Scan the dump file

    Qread = False
    Qgetarg = False

    for line in lines:

        word = line.split()
        if len( word ) == 0 : continue                # skip blank lines
        if word[ 0 ].isdigit() : word = word[ 1: ]    # remove line numbers

        key = word[ 0 ]

        if key == "Namespace:" : continue

        if key == "procedure":
            proc = word[ -1 ]

            varlist[ proc ] = []
            arglist[ proc ] = []
            comlist[ proc ] = []
            modlist[ proc ] = []
            reslist[ proc ] = []
            namlist[ proc ] = []
            sublist[ proc ] = []
            continue

        if key == "common:": continue
        if key == "symtree:": sym = word[ 1 ].strip("'").lower()

        # result variable
        if ( sym == proc ) and ( key == "result:" ):
            reslist[ proc ].append( word[ 1 ] )

        # dummy arguments
        if "DUMMY" in line:
            arglist[ proc ].append( sym )

        # common variables
        if "IN-COMMON" in line:
            comlist[ proc ].append( sym )

        # module variables
        if ( "VARIABLE" in line ) and ( "USE-ASSOC" in line ):
            modlist[ proc ].append( sym )

        # child subroutines
        if key == "CALL":
            pushuniq( sublist[ proc ], word[ 1 ] )

        # namelists
        if ( key == "READ" ) and ( "NML=" in line ):
            namlist[ proc ].append( word[ -1 ].split("NML=")[ -1 ] )

        # iostat
        if "IOSTAT=" in line:
            tmp = line.split("IOSTAT=")[ 1 ].split()[ 0 ]
            sym = getvarname( tmp, proc )
            pushuniq( varlist[ proc ], (sym, "iostat") )
        #/

        def addmemvar( op ):
            for v in word[ 1: ]:
                if proc in v:
                    sym = getvarname( v, proc )
                    pushuniq( varlist[ proc ], (sym, op) )

        # allocation
        if key == "ALLOCATE"    : addmemvar( "alloc" )
        if key == "DEALLOCATE"  : addmemvar( "dealloc" )
        if "move_alloc" in line : addmemvar( "move_alloc" )

        # search for modified variables
        if key == "READ"   : Qread = True
        if key == "DT_END" : Qread = False

        if ( key == "ASSIGN" ) or \
           ( Qread and ( key == "TRANSFER" ) ) or \
           ( key == "WRITE" and ( proc in word[ 1 ] ) ):

            if key == "ASSIGN"   : code = "assign"
            if key == "WRITE"    : code = "write"
            if key == "TRANSFER" : code = "read"

            sym = getvarname( word[ 1 ], proc )
            pushuniq( varlist[ proc ], (sym, code) )
        #/
    #/

    all_lists = { "var": varlist, "arg": arglist, "com": comlist,
                  "mod": modlist, "res": reslist, "sub": sublist,
                  "nam": namlist }

    #.........................................................
    # Print results

    for proc in varlist.keys():
        print( "-" * 60 )
        print( proc + ":" )

        for tag in [ "arg", "com", "mod", "res" ]:

            if tag == "arg":
                print( "    " + tag + ":", arglist[ proc ] )
            else:
                print( "    " + tag + ":" )

            for (sym, code) in varlist[ proc ]:
                if sym in all_lists[ tag ][ proc ]:
                    print( "        %-10s  (%s)" % (sym, code) )
            #/
        #/

        print( "    misc:" )
        for (sym, code) in varlist[ proc ]:
            if ":" in sym:
                print( "        %-10s  (%s)" % (sym, code) )
        #/

        print( "    call:", sublist[ proc ] )

        if len( namlist[ proc ] ) > 0:
            print( "    namelist:", namlist[ proc ] )
    #/

    return all_lists
#/

#------------------------------------------------------------------------
if __name__ == "__main__":

    if len( sys.argv ) == 1:
        sys.exit( "Usage: varcheck.py <filenames>" )
    else:
        filenames = sys.argv[ 1: ]

    for filename in filenames:
        varcheck( filename )
#/

Example 1: LAPACK zheev

$ ./varcheck.py zheev.f
------------------------------------------------------------
zheev:
    arg: ['jobz', 'uplo', 'n', 'a', 'lda', 'w', 'work', 'lwork', 'rwork', 'info']
        info        (assign)
        work        (assign)
        w           (assign)
        a           (assign)
    com:
    mod:
    res:
    call: ['xerbla', 'zlascl', 'zhetrd', 'dsterf', 'zungtr', 'zsteqr', 'dscal']

Example 2: Simple test program

!--------------------------------------------------------
module myvar
    character(50) :: str
    type mytype
        integer :: n
    endtype
    type(mytype) :: obj
    integer :: foo
end

!--------------------------------------------------------
subroutine mysub( a, b, c, ios, n, p, q, r )
    use myvar
    dimension b(10)
    common /com1/ dat( 50 ), x, y, z, wtf(1000)
    common /com2/ dat2( 50 )
    integer inp, out
    namelist /list/ str
    namelist /list2/ p, q
    inp = 10 ; out = 20

    open( inp, file="test.dat", status="old", iostat=ios10 )
    read( inp, *, iostat=ios ) a, ( b(i), i=3,5 )
    write( out, * ) "hello"
    read( inp, * ) c
    read( inp, list )
    close( inp, iostat=ios30 )

    write( str, "(f8.3)" ) a + c

    do i = 1, n
        dat( i ) = b( i )
    enddo
    x = p + q
    y = x * 2
100 c = dat( 1 ) + x + y
end

!--------------------------------------------------------
subroutine mysub2( &
        a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, &
        b1, b2, b3, b4, b5, b6, b7, b8, b9, b10, &
        c1, c2, c3, c4, c5, c6, c7, c8, c9, c10 )
    a3 = 3.0
    b5 = 5.0
end

!--------------------------------------------------------
function myfunc( x, m )
    common /com2/ dat2(50)
    common /com3/ dat3(50)
100 myfunc = x + dat2( m )
200 m = 5
    where( dat2 < 1.0 ) dat2 = 500.0
end

!--------------------------------------------------------
function myfunc2() result( res )
    use myvar
    implicit none
    integer :: res
    obj % n = 500
    res = obj % n
    call sub2( res )
end

!--------------------------------------------------------
subroutine myalloc( a, ier )
    implicit none
    integer, allocatable :: a(:), b(:)
    integer ier
    allocate( a( 10 ), b( 20 ), source=0, stat=ier )
end

!--------------------------------------------------------
subroutine mydealloc( a, b, ier )
    implicit none
    integer, allocatable :: a(:), b(:)
    integer ier
    deallocate( a, b, stat=ier )
end

!--------------------------------------------------------
subroutine mymovealloc( a, b )
    implicit none
    integer, allocatable :: a(:), b(:)
    call move_alloc( a, b )
end

!--------------------------------------------------------
program main
    use myvar
    implicit none
    integer a, dat
    common /com/ dat

    call mymain_int
    print *, a, dat, foo
contains
    subroutine mymain_int
        integer b
        a = 1
        b = 2
        dat = 100
        foo = 200
    end subroutine
end program

!--------------------------------------------------------
module mymod
    use myvar
    implicit none
    integer bar
contains
    subroutine mymod_sub
        use myvar
        integer a, dat
        common /com/ dat

        call mymod_sub_int
        bar = 300
        print *, a, dat, foo, bar
    contains
        subroutine mymod_sub_int
            integer b
            a = 1
            b = 2
            dat = 100
            foo = 200
        end subroutine
    end subroutine
end module

Result:

------------------------------------------------------------
mysub:
    arg: ['a', 'b', 'c', 'ios', 'n', 'p', 'q', 'r']
        ios         (iostat)
        a           (read)
        b           (read)
        c           (read)
        c           (assign)
    com:
        dat         (assign)
        x           (assign)
        y           (assign)
    mod:
        str         (write)
    res:
    call: []
    namelist: ['list']
------------------------------------------------------------
mysub2:
    arg: ['a1', 'a10', 'a2', 'a3', 'a4', 'a5', 'a6', 'a7', 'a8', 'a9', 'b1', 'b10', 'b2', 'b3', 'b4', 'b5', 'b6', 'b7', 'b8', 'b9', 'c1', 'c10', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7', 'c8', 'c9']
        a3          (assign)
        b5          (assign)
    com:
    mod:
    res:
    call: []
------------------------------------------------------------
myfunc:
    arg: ['m', 'x']
        m           (assign)
    com:
        dat2        (assign)
    mod:
    res:
        myfunc      (assign)
    call: []
------------------------------------------------------------
myfunc2:
    arg: []
    com:
    mod:
        obj         (assign)
    res:
        res         (assign)
    call: ['sub2']
------------------------------------------------------------
myalloc:
    arg: ['a', 'ier']
        ier         (alloc)
        a           (alloc)
    com:
    mod:
    res:
    call: []
------------------------------------------------------------
mydealloc:
    arg: ['a', 'b', 'ier']
        ier         (dealloc)
        a           (dealloc)
        b           (dealloc)
    com:
    mod:
    res:
    call: []
------------------------------------------------------------
mymovealloc:
    arg: ['a', 'b']
        a           (move_alloc)
        b           (move_alloc)
    com:
    mod:
    res:
    call: ['_gfortran_move_alloc']
------------------------------------------------------------
main:
    arg: []
    com:
    mod:
    res:
    misc:
    call: ['mymain_int']
------------------------------------------------------------
mymain_int:
    arg: []
    com:
    mod:
    res:
    misc:
        main:a      (assign)
        main:dat    (assign)
        main:foo    (assign)
    call: []
------------------------------------------------------------
mymod_sub:
    arg: []
    com:
    mod:
    res:
    misc:
        mymod:bar   (assign)
    call: ['mymod_sub_int']
------------------------------------------------------------
mymod_sub_int:
    arg: []
    com:
    mod:
    res:
    misc:
        mymod_sub:a    (assign)
        mymod_sub:dat  (assign)
        mymod_sub:foo  (assign)
    call: []
roygvib
  • 7,218
  • 2
  • 19
  • 36
  • The python script is a great one. I already accepted the answer as it is what I was asking for. I just try it on some test programs and it is working. You actually provide more valuable tools than I expected like the common bloc variables, list of called functions. I will definitely use it my actual work and give you feedback. I will do more testing today and come back with questions. – innoSPG Aug 14 '16 at 17:01
  • 1
    No problem, actually, I have been also looking for this kind of tools (because I was severely troubled by some large package before). I made this community wiki, so I hope anyone add anything if necessary (e.g., bug!!!) – roygvib Aug 14 '16 at 17:04
  • One thing if you don't mind, is it easy for you to account for global variables? For now the program crashes when global variables are accessed. – innoSPG Aug 14 '16 at 17:04
  • You are right. Most global variables are in common or modules. Some can be in the main program and used by subroutines in the contains section of the main program; these are those I am having issue with. – innoSPG Aug 14 '16 at 17:23
  • 1
    Now the script works also for cases with internal subprograms (in a rather ad-hoc way!), but the reported "host" name may be incorrect... in that case, please don't mind :) – roygvib Aug 15 '16 at 22:21