
I want to move from MATLAB to open-source alternatives such as SciPy and NumPy, but I am running into speed problems. I am aware that multi-core execution can sometimes be slower than single-core because of overhead. However, the process I am trying to port was actually sped up by parallelization in MATLAB.

I have a function that does some math on each pixel of a 2D matrix, using three nested loops:

def reconstruct2D(frame, parameters):
    """
    Does some nested for loop operations on 2D data
    """
    for channel_i in range(nr_cols):  # for every channel
        for y in range(nr_rows):
            for x in range(nr_cols):
                # Do some calculations here

What I normally did in MATLAB was call this function for each slice along the third dimension of a 3D matrix:

parfor frameNo = 1:N
    result(:,:,frameNo) = reconstruct2D(rawFrame(:,:,frameNo), parameters);
end

With 4 cores active, this runs four times faster. However, when I try the same thing with Joblib, the frames are still processed one after another.

import numpy as np
from scipy import signal
from joblib import Parallel, delayed


def reconstruct2D(frame, parameters):
    # Same as above

if __name__ == '__main__':
    print('Main Loop is running...')
    Parallel(n_jobs=4)(delayed(reconstruct2D(frame[:, :, indx], parameters)) for indx in range(N))
    print('Main Loop is finished...')

Processing a single frame is also much slower in Python: it takes 1.8 s in MATLAB and 19 s in Python.

I basically have two questions:

  1. Does anybody have an idea why processing a single frame is 10 times slower in Python?
  2. Why does joblib process the frames in order rather than concurrently?

I am using Python 3.5 on 64-bit Windows 7, on a machine with 4 cores.
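
One possible explanation for the sequential behaviour: `delayed(reconstruct2D(frame[:, :, indx], parameters))` calls `reconstruct2D` in the parent process before `delayed` ever sees it, because Python evaluates a function's arguments first. The form joblib documents is `delayed(func)(args)`, which only records the call so that a worker can run it later. The sketch below shows that form. The body of `reconstruct2D` and the `scale` parameter are hypothetical stand-ins for the real per-pixel math, and `process_frames` is an illustrative helper name.

```python
import numpy as np
from joblib import Parallel, delayed


def reconstruct2D(frame, scale):
    # Hypothetical stand-in for the real per-pixel math: scale every pixel.
    return frame * scale


def process_frames(frames, scale, n_jobs=2):
    # delayed(func)(args) only records the call; a worker executes it later.
    # delayed(func(args)) would instead run func in the parent process first.
    n_frames = frames.shape[2]
    results = Parallel(n_jobs=n_jobs)(
        delayed(reconstruct2D)(frames[:, :, i], scale) for i in range(n_frames)
    )
    # Stack the per-frame results back into a (rows, cols, frames) array.
    return np.stack(results, axis=2)


if __name__ == "__main__":
    frames = np.arange(24, dtype=float).reshape(2, 3, 4)
    out = process_frames(frames, scale=2.0)
    print(out.shape)
```

Unlike the MATLAB `parfor` loop, which assigns into `result(:,:,frameNo)`, `Parallel` returns a list of per-frame results in input order, so they have to be collected and stacked explicitly.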

  • Newer MATLAB uses just-in-time compilation to speed up routine loops. `numpy` does not have that, and requires the same sort of array operations that older MATLAB required to be fast. Iterative code can be sped up with extra packages like `cython` and `numba`. But first make sure you are doing as much of your calculation as you can without loops. – hpaulj Oct 30 '16 at 21:36
  • Thanks a lot @hpaulj! I did some research on JIT and it helped me a lot. Just adding the `@jit` decorator increased my speed significantly. Now MATLAB with a 4-core parfor takes 4.6 s, and single-core Python takes only 6.7 s. However, I still have not managed to parallelize the process. – umitarabul Oct 31 '16 at 09:34
  • @umitarabul: numba also provides a `prange` function that parallelizes loops very much like MATLAB's `parfor` – TheBlackCat Oct 31 '16 at 19:38
  • @TheBlackCat: As far as I know, it is no longer supported. – umitarabul Nov 02 '16 at 21:11
