
I have a function var. I want to know the best way to run the for loop inside this function (over multiple coordinate pairs, xs and ys) quickly with multiprocessing/parallel processing, utilizing all the processors, cores, and RAM the system has.

Is it possible using the Dask module?

The pysheds documentation can be found here.

import numpy as np
from pysheds.grid import Grid

xs = 82.1206, 72.4542, 65.0431, 83.8056, 35.6744
ys = 25.2111, 17.9458, 13.8844, 10.0833, 24.8306

  
for (x, y) in zip(xs, ys):
    grid = Grid.from_raster('E:/data.tif', data_name='map')
    grid.catchment(data='map', x=x, y=y, out_name='catch', recursionlimit=1500, xytype='label')
    ....
    ....
    results

Gun

2 Answers


You didn't post a link to your data.tif file, so the sample code below uses pysheds/data/dem.tif from https://github.com/mdbartos/pysheds. The basic idea is to split the input parameters, xs and ys in your case, into subsets, then give each CPU a different subset to work on.
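For example, striding with a step of n_cpu splits a sequence into interleaved subsets, one per CPU:

# striding splits a sequence into n_cpu interleaved subsets
xs = (10, 20, 30, 40, 50, 60, 70, 80, 90, 100)
n_cpu = 3
for cpu in range(n_cpu):
    print(cpu, xs[cpu::n_cpu])
# 0 (10, 40, 70, 100)
# 1 (20, 50, 80)
# 2 (30, 60, 90)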

main() computes the solution twice, once sequentially and once in parallel, then compares the two. There's some inefficiency in the parallel solution since the image file is read by each CPU, so there's room for improvement (i.e., read the image file once outside the parallel portion, then give the resulting grid object to each instance; a sketch of that idea follows the code below).

import numpy as np
from pysheds.grid import Grid
from dask.distributed import Client
from dask import delayed, compute

xs = 10, 20, 30, 40, 50, 60, 70, 80, 90, 100
ys = 25, 35, 45, 55, 65, 75, 85, 95, 105, 115  # same length as xs

def var(image_file, x_in, y_in):
    # compute the mean catchment value for each (x, y) pair
    grid = Grid.from_raster(image_file, data_name='map')
    variable_avg = []
    for (x, y) in zip(x_in, y_in):
        grid.catchment(data='map', x=x, y=y, out_name='catch')
        variable = grid.view('catch', nodata=np.nan)
        variable_avg.append(np.array(variable).mean())
    return variable_avg

def var_parallel(n_cpu, image_file, x_in, y_in):
    # split the coordinates into n_cpu interleaved subsets and build one
    # delayed task per subset
    tasks = []
    for cpu in range(n_cpu):
        x_sub = x_in[cpu::n_cpu] # eg, cpu = 0: x_sub = (10, 40, 70, 100)
        y_sub = y_in[cpu::n_cpu]
        tasks.append( delayed(var)(image_file, x_sub, y_sub) )
    ans = compute(tasks)
    # reassemble the solution in the original order
    par_avg = [None]*len(x_in)
    for cpu in range(n_cpu):
        par_avg[cpu::n_cpu] = ans[0][cpu]
    print('AVG (parallel)  =',par_avg)
    return par_avg

def main():
    image_file = 'pysheds/data/dem.tif'
    # sequential solution:
    seq_avg = var(image_file, xs, ys)
    print('AVG (sequential)=',seq_avg)
    # parallel solution:
    n_cpu = 3
    dask_client = Client(n_workers=n_cpu)
    par_avg = var_parallel(n_cpu, image_file, xs, ys)
    dask_client.shutdown()
    print('max error=',
        max([ abs(seq_avg[i]-par_avg[i]) for i in range(len(seq_avg))]))

if __name__ == '__main__': main()
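As noted above, one possible refinement is to read the raster once and hand the resulting grid object to each task. Below is a minimal, untested sketch of that idea (reusing the imports, xs, ys, and n_cpu from above, and assuming a pysheds Grid object can be pickled and deep-copied):

import copy

def var_grid(grid, x_in, y_in):
    # same as var(), but takes an already-loaded grid instead of a file name;
    # catchment() mutates the grid, so work on a private copy
    grid = copy.deepcopy(grid)
    variable_avg = []
    for (x, y) in zip(x_in, y_in):
        grid.catchment(data='map', x=x, y=y, out_name='catch')
        variable_avg.append(np.array(grid.view('catch', nodata=np.nan)).mean())
    return variable_avg

grid = Grid.from_raster('pysheds/data/dem.tif', data_name='map')  # read once
tasks = [delayed(var_grid)(grid, xs[cpu::n_cpu], ys[cpu::n_cpu])
         for cpu in range(n_cpu)]
parts = compute(tasks)[0]  # list of per-CPU result lists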
AlDanial
  • I don't understand why you don't use multiprocessing.Pool, set the pool size to the number of CPUs (or use os.cpu_count()), and do: for x, y in zip(xs, ys): _async_proc = processes_pool.apply_async(var, (imag_name, x, y)) – StefanMZ Sep 14 '20 at 20:17
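For reference, here is a minimal sketch of the multiprocessing.Pool approach from the comment above, using pool.starmap instead of apply_async for brevity. It reads the raster once per coordinate pair, so it carries the same inefficiency noted earlier; var_one is an illustrative name:

import os
import numpy as np
from multiprocessing import Pool
from pysheds.grid import Grid

def var_one(image_file, x, y):
    # one catchment computation per (x, y) pair
    grid = Grid.from_raster(image_file, data_name='map')
    grid.catchment(data='map', x=x, y=y, out_name='catch')
    return np.array(grid.view('catch', nodata=np.nan)).mean()

if __name__ == '__main__':
    image_file = 'pysheds/data/dem.tif'
    xs = (10, 20, 30, 40, 50)
    ys = (25, 35, 45, 55, 65)
    with Pool(os.cpu_count()) as pool:
        results = pool.starmap(var_one, [(image_file, x, y) for x, y in zip(xs, ys)])
    print(results)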

I have tried to give reproducible code below using dask. You can add the main processing part of pysheds, or any other function, inside it for faster parallel iteration over the parameters.

The documentation of the dask module can be found here.

import dask
from dask import delayed, compute
from dask.distributed import Client, progress
from pysheds.grid import Grid

client = Client(threads_per_worker=2, n_workers=2)  # choose the number of workers and threads per worker to deploy for your task

xs = 82.1206, 72.4542, 65.0431, 83.8056, 35.6744
ys = 25.2111, 17.9458, 13.8844, 10.0833, 24.8306

# First, create a function that performs the per-coordinate processing.
def var(x, y):
    grid = Grid.from_raster('data.tif', data_name='map')
    grid.catchment(data='map', x=x, y=y, out_name='catch', recursionlimit=1500, xytype='label')
    ...
    ...
    return result

# Now call the function in a 'dask' way.
lazy_results = []

for (x, y) in zip(xs, ys):
    lazy_result = dask.delayed(var)(x, y)
    lazy_results.append(lazy_result)

# Final command to execute var(x, y) for every pair and collect the results.
dask.compute(*lazy_results)
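Note that dask.compute returns a tuple with one result per delayed task, in the same order as lazy_results, so you can capture and pair it with the inputs, e.g.:

results = dask.compute(*lazy_results)
for (x, y), res in zip(zip(xs, ys), results):
    print(x, y, res)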
mArk