
I've got a Python GUI (wxPython) which wraps around a fortran "back-end," using f2py. Sometimes, the fortran process(es) may be quite long running, and we would like to put a progress bar in the GUI to update the progress through the Fortran routine. Is there any way to get the status/progress of the Fortran routine, without involving file I/O?

brettb
    First you have to have a sense for what is taking a long time in the fortran routine. Have you profiled the fortran code to see where the code is spending a long time? Also, why are you not keen to use file I/O? A simple debugging statement or progress output is the usual way to track progress... – Ross Sep 30 '15 at 19:27
  • @Ross it is just a long process. I'm not trying to profile/debug, I would like to put a progress bar into my GUI to show the user how long to expect it to run. Apologies if this was unclear. I'm not keen on adding file I/O because of its added overhead, but that is of course my only option at this point. – brettb Sep 30 '15 at 19:29
  • But you need to profile, and you need some way to send the message about where you actually are in the code just now. My 2 cents: forget it, it is MUCH more work than you probably think it might be. – Vladimir F Героям слава Sep 30 '15 at 19:32
  • I agree with @VladimirF - how can you know your progress if you don't know what the internal code timings are like? I guess you could make an a-priori guess using some heuristic, but that doesn't sound very good. – Ross Sep 30 '15 at 19:36
  • I/O is *not* heavy unless it's a lot of data being output, by the way. I was talking about a simple `completed step 1 of 5`. – Ross Sep 30 '15 at 19:37
  • Thanks for all the feedback. Basically, my FORTRAN code will run through a loop to check some data. The loop on a given run may be a few thousand iterations, or 20 million iterations. In the 20 million iterations, this might take hours or days to run through. So I basically just want to tell the Python GUI which loop iteration the backend is on. In the case of millions of iterations, writing "i, N" to a file is low overhead. In the case of smaller runs, I thought this might be a little too much overhead. – brettb Sep 30 '15 at 21:08
  • @brettb My answer addresses exactly that problem - you should use a `mod` statement to output the iteration number (or percent complete) only infrequently. – Ross Sep 30 '15 at 21:30
  • @Ross: thanks. Yes, your answer was pretty much my fall-back plan. What I was hoping to find with my question was if there was a more direct way to pass information between the fortran and python codes, as is the purpose of using f2py in the first place. Appreciate your help, though! – brettb Sep 30 '15 at 21:34

3 Answers


I found a method to do so - whether it's a good method or not, I can't say, but I have tried it and it does work. I wrote it up elsewhere, and I'll post it here as well. This method does require having the Fortran source code, however, so it probably won't work for you if you're stuck with just the compiled file.

First, in the Fortran code, define integers named progress and max_prog within your module:

module fblock
  use iso_fortran_env
  integer(kind=int32), save :: progress = 0
  integer(kind=int32), save :: max_prog = 1
  
  contains
    subroutine long_runner(...)

The save attribute is used here so the variables don't become undefined outside the subroutine's scope (i.e. when they are accessed before the subroutine is called or after it finishes, which is something that happens in the next steps). If you're using Fortran 2008 or later, the save attribute isn't necessary, as module variables are always saved.

Then, in the long-running subroutine, add a line telling f2py to release Python's Global Interpreter Lock (GIL):

module fblock
  use iso_fortran_env
  integer(kind=int32), save :: progress = 0
  integer(kind=int32), save :: max_prog = 1

  contains
    subroutine long_runner(...)
      !f2py threadsafe

Releasing the GIL prevents Python from becoming completely unresponsive while the Fortran block runs, allowing a separate Python thread to run during its execution (that thread is the next step; I don't know enough about thread safety to say much about it, but this step is somewhat needed to make the whole thing work). Finally, simply add one to the progress variable in your code:

module fblock
  use iso_fortran_env
  integer(kind=int32), save :: progress = 0
  integer(kind=int32), save :: max_prog = 1

  contains
    subroutine long_runner(input_data, output_data)
      !f2py threadsafe

      ! other code ...

      max_prog = giant_number

      ! possibly more code...

      do i = 1, giant_number
        progress = progress + 1
        ! yet more code...

You'll have to adapt this to how your code runs, depending on whether or not it runs in one giant do loop, but in the end you're simply incrementing a number. Note: if you're using OpenMP for parallel work, update the progress on the first thread/processor only:

subroutine long_runner(input_data, output_data)
  use omp_lib

  ! code...

  ! assumes iterations are split evenly across omp_get_num_procs() threads
  max_prog = giant_number / omp_get_num_procs()

  !$OMP PARALLEL
  proc_num = omp_get_thread_num()
  ! ...

  !$OMP DO
  do i = 1, giant_number
    ! ...
    if (proc_num == 0) then
      progress = progress + 1
    end if
    ! ...

Now, once you've compiled that with f2py into a Python module, it's time for the second step: handling the Python side. Say, for example, your Fortran module fblock with the subroutine long_runner has been compiled into the file pyblk.pyd.
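For reference, the f2py build command would look something like this (the module and file names simply match this example; the exact invocation can vary with your setup):

python -m numpy.f2py -c fblock.f90 -m pyblk

Import pyblk along with the threading and time modules: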

import pyblk  # your f2py compiled fortran block
import threading
import time
 
global bg_runn
bg_runn = True
 
def background():
    "background query/monitoring thread"
    time.sleep(0.1)  # wait a bit for foreground code to start
    while bg_runn:
        a = pyblk.fblock.progress
        b = pyblk.fblock.max_prog
        if a >= b: break
 
        print(a, 'of', b, '({:.3%})'.format(a / b))
        time.sleep(2)
 
# start the background thread
thrd = threading.Thread(target=background)
thrd.start()
 
print(time.ctime())
# then call the compiled fortran
init_data = None  # whatever inputs long_runner expects, if any
output_data = pyblk.fblock.long_runner(init_data)
 
bg_runn = False  # stop the background thread once done
thrd.join()  # wait for it to stop (wait out the sleep cycle)
print(time.ctime())

Only progress and max_prog are read (and reading doesn't modify anything) during the whole run on the Python side. Everything else in the Fortran block is internal to the subroutine, and nothing was set up to interfere with those anyway - progress and max_prog were the only variables made to be looked at from outside the subroutine. The output from Python might look something like this:

...
1026869 of 4793490 (21.422%)
1056318 of 4793490 (22.037%)
1086679 of 4793490 (22.670%)
1116830 of 4793490 (23.299%)
...

This adds a negligible amount to the run time, if any; I didn't notice a difference while testing it.


Now, tying this into a GUI with a fancy progress bar gets far more complicated, as both the GUI and the Fortran block have to run in the foreground. You can't just run the Fortran in a background thread (at least, I can't - Python crashes entirely for me). So you have to start an entirely separate process where the Fortran can run, using Python's multiprocessing module.

+===========+---------------+===========+
|  Primary  | -- Spawns --> | Secondary |
|  Process  |               |  Process  |
+===========+               +===========+
|foreground:|               |foreground:|
|    GUI    |               |  fortran  |
+- - - - - -+               +- - - - - -+
|background:|               |background:|
|           |  <-- Queue -- |   query   |
+-----------+               +-----------+                   

So the setup is to have the GUI as the primary process, which starts a secondary process where your Fortran code runs with its own background query thread. A multiprocessing.Queue is set up (to pass information back to the GUI) and handed to a multiprocessing.Process (the secondary), which then starts the query thread and runs the Fortran. The query thread on the secondary puts its findings into the Queue instead of printing them like the above. Back on the primary process, information is pulled out of the Queue and used to set the progress bar. I'm not familiar with wxPython, but here's an example of this whole complication using another GUI library, PySide2:

from multiprocessing import shared_memory
import multiprocessing as mp
import numpy as np
import threading
import sys, os
import time
 
import pyblk  # your f2py compiled fortran block
 
# to use PyQt5, replace 'PySide2' with 'PyQt5'
from PySide2.QtWidgets import (QApplication, QWidget, QVBoxLayout,
                               QProgressBar, QPushButton)
 
 
global bg_runn
bg_runn = True
 
# this query is run in the background on the secondary process 
def bg_query(prog_que):
    "background query thread for compiled fortran block"
    global bg_runn
    current, total = 0, 1
 
    # wait a bit for fortran code to initialize the queried variables
    time.sleep(0.25)
     
    while bg_runn:
        # query the progress
        current = pyblk.fblock.progress
        total   = pyblk.fblock.max_prog
 
        if current >= total: break
        prog_que.put((current, total))
        time.sleep(0.1)  # this can be more or less depending on need
         
    prog_que.put((current, total))
    prog_que.put('DONE')  # inform other end that this is complete
    return
 
# this fortran block is run on the secondary process
def run_fortran(prog_que, init_data):
    "call to run compiled fortran block"
    global bg_runn
     
    # setup/start background query thread
    thrd = threading.Thread(target=bg_query, args=(prog_que, ))
    thrd.start()
     
    # call the compiled fortran code
    results = pyblk.fblock.long_runner(init_data)
     
    bg_runn = False  # inform query to stop
    thrd.join()  # wait for it to stop (wait out the sleep cycle)
     
    # now, do something with the results or
    # copy the results out from this process
    ##shm = shared_memory.SharedMemory('results')  # connect to shared mem
    ##b = np.ndarray(results.shape, dtype=results.dtype, buffer=shm.buf)
    ##b[:] = img_arr[:]  # copy results (memory is now allocated)
    ##shm.close()  # disconnect from shared mem
    return
 
 
# this GUI is run on the primary process
class ProgTest(QWidget):
    "progess test of compiled fortran code through python"
    def __init__(self, parent=None):
        super().__init__()
        # setup/layout of widget
        self.pbar = QProgressBar()
        self.pbar.setTextVisible(False)
 
        self.start_button = QPushButton('Start')
        self.start_button.clicked.connect(self.run_the_thing)
         
        ly = QVBoxLayout()
        ly.addWidget(self.start_button)
        ly.addWidget(self.pbar)
        self.setLayout(ly)
         
    def run_the_thing(self):
        "called on clicking the start button"
        self.setEnabled(False)  # prevent interaction during run
        app.processEvents()
 
        t0 = time.time()
        print('start:', time.ctime(t0))
         
        prog_que = mp.Queue()  # progress queue
 
        # if wanting the results on the primary process:
        # create shared memory to later copy result array into
        # (array size is needed; no memory is used/allocated at this time)
        ##shm = shared_memory.SharedMemory('results', create=True,
        ##                                 size=np.int32(1).nbytes * amount)
 
        init_data = None  # your initial information, if any
        # if it's large and on disk, read it in on the secondary process
 
        # setup/start the secondary process with the compiled fortran code
        run = mp.Process(target=run_fortran, args=(prog_que, init_data))
        run.start()
 
        # listen in on the query through the Queue
        while True:
            res = prog_que.get()
            if res == 'DONE': break
            current, total = res  # unpack from queue
 
            if total != self.pbar.maximum(): self.pbar.setMaximum(total)
 
            self.pbar.setValue(current)
            self.setWindowTitle('{:.3%}'.format(current / total))
            app.processEvents()
        # this while loop can be done on a separate background thread
        # but isn't done for this example
             
        run.join()  # wait for the secondary process to complete
 
        # extract the results from secondary process with SharedMemory
        # (shape and dtype need to be known)
        ##results = np.ndarray(shape, dtype=np.int32, buffer=shm.buf)
 
        t1 = time.time()
        print('end:', time.ctime(t1))
        print('{:.3f} seconds'.format(t1 - t0))
 
        self.pbar.setValue(total)
        self.setWindowTitle('Done!')
        self.setEnabled(True)
        return
 
if __name__ == '__main__':
    app = QApplication(sys.argv)
    window = ProgTest()
    window.show()
    sys.exit(app.exec_())

This method of running it on a secondary process creates a new problem, however - your results are on the secondary! How this is handled depends entirely on what your results are. If you can deal with them on the secondary (e.g. save the data after computing it), it's probably best to do it there. But if you need them back on the primary process to interact with, you'll have to copy them out.

Generally, working with f2py involves numpy, so your results are likely a numpy array of some kind. Here are a few methods I tried (using a 1600000000-byte array) for getting it from the secondary process to the primary:

  • Creating and copying the results using multiprocessing.Array to get them from secondary to primary - this doubles the memory for the entire run time and adds ~30 to 45 seconds to run time.
  • Stuffing the results into a multiprocessing.Queue to get them from secondary to primary - more than doubles the memory during the stuffing and adds ~5 to 10 seconds to run time - and not necessarily a good use of the Queue.
  • Copying the results using shared_memory.SharedMemory to get them from secondary to primary - doubles the memory during the copy and adds ~1 second to run time. This is the method that's commented out in the above GUI example, and a standalone sketch of it follows this list. I realize this option didn't exist when this question was asked.
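Here is a minimal sketch of the SharedMemory route (Python 3.8+). Both "sides" are shown in one script for brevity; in the real setup the create/read part runs on the primary process and the attach/copy part runs on the secondary. The block name, size, and dtype here are made up for illustration:

import numpy as np
from multiprocessing import shared_memory

n = 1_000_000  # known size of the result array

# primary: allocate the named shared block before starting the secondary
shm = shared_memory.SharedMemory(name='results', create=True,
                                 size=np.dtype(np.float64).itemsize * n)

# secondary: attach to the block by name and copy the results into it
shm2 = shared_memory.SharedMemory(name='results')
results = np.random.rand(n)  # stand-in for the fortran output
dst = np.ndarray((n,), dtype=np.float64, buffer=shm2.buf)
dst[:] = results[:]
shm2.close()

# primary: copy out of the shared block, then release it
out = np.ndarray((n,), dtype=np.float64, buffer=shm.buf).copy()
shm.close()
shm.unlink()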
Status

If you have a basic understanding of what portions of your code take the most time, you can include a progress indicator. Consider the following example code:

program main
implicit none
integer, parameter :: N = 100000
integer :: i
real, allocatable :: a(:), b(:)

! -- Initialization
write(*,*) 'FORFTP: Beginning routine'

allocate(a(N), b(N))
a = 0.
b = 0.

write(*,*) 'FORFTP: Completed initialization'

do i=1,N
   call RANDOM_NUMBER(a(i))
   b(i) = exp(a(i))     ! Some expensive calculation

   if (mod(i,N/100)==0) then     ! -- Assumes N is evenly divisible by 100
      write(*,*) 'FORFTP: completed ', i*100/N, ' percent of the calculation'
   endif
enddo

write(*,*) 'FORFTP: completed routine'

end program main

The user would then get an update after initialization and after each percent of the 'expensive calculation' is complete.

I don't know how f2py works, but I assume there is some way for Python to read what Fortran is outputting as it runs and display that in its GUI. In this example, anything tagged FORFTP would be output in the GUI; I'm using standard output.
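For instance, if the routine were built as a standalone executable rather than wrapped with f2py, Python could pick up the tagged lines as they appear using subprocess (a sketch, assuming the executable is named ./a.out):

import subprocess

# launch the fortran executable and read its stdout line by line
# note: fortran runtimes may buffer output when piped, so updates can
# arrive in bursts; flushing in the fortran code helps if they lag
proc = subprocess.Popen(['./a.out'], stdout=subprocess.PIPE, text=True)
for line in proc.stdout:
    if line.lstrip().startswith('FORFTP:'):
        print(line.strip())  # or update a GUI progress widget here
proc.wait()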

This example illustrates the problem with investigating progress, though. It's difficult to understand how much time the allocation takes compared to the calculation. So it's hard to say that the initialization is 15% of the total execution time, for example.

However, it's still useful to have updates on what's going on, even if they don't have an exact progress meter.

Edit: The routine produces the following output:

 >  pgfortran main.f90
 >  ./a.out 
 FORFTP: Beginning routine
 FORFTP: Completed initialization
 FORFTP: completed             1  percent of the calculation
 FORFTP: completed             2  percent of the calculation
  ...
 FORFTP: completed            99  percent of the calculation
 FORFTP: completed           100  percent of the calculation
 FORFTP: completed routine
Ross

At high risk of being downvoted: if you know roughly how long each task is going to take, the simplest option is to base the progress on how much time has elapsed since the task started, measured against the expected duration of the task.
To keep it relevant, you could store the run duration for the task each time and use that, or an average, as your baseline.
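A minimal sketch of that idea (expected would come from the stored durations of earlier runs, and task_is_running() is a hypothetical stand-in for however the GUI checks on the backend):

import time

expected = 120.0  # expected duration in seconds, e.g. from previous runs
start = time.time()
while task_is_running():  # hypothetical: however the GUI polls the backend
    elapsed = time.time() - start
    percent = min(100.0, 100.0 * elapsed / expected)
    print('~{:.0f}% (estimated)'.format(percent))
    time.sleep(2)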
Sometimes we can overcomplicate things ;)

Rolf of Saxony
  • "The loop on a given run may be a few thousand iterations, or 20 million iterations. In the 20 million iterations, this might take hours or days to run through. So I basically just want to tell the Python GUI which loop iteration the backend is on." Doesn't seem to be the case. – Vladimir F Героям слава Oct 01 '15 at 08:38