
Setup

I have the following two implementations of a matrix-calculation:

  1. The first implementation uses a matrix of shape (n, m), and the calculation is repeated in a for-loop repetition times:
import numpy as np
from numba import jit

@jit
def foo():
    for i in range(1, n):
        for j in range(1, m):

            _deleteA = (
                        matrix[i, j] +
                        #some constants added here
            )
            _deleteB = (
                        matrix[i, j-1] +
                        #some constants added here
            )
            matrix[i, j] = min(_deleteA, _deleteB)

    return matrix

repetition = 3
for x in range(repetition):
    foo()


  2. The second implementation avoids the extra for-loop by folding repetition = 3 into the matrix, which then has shape (repetition, n, m):

@jit
def foo():
    for i in range(1, n):
        for j in range(1, m):

            _deleteA = (
                        matrix[:, i, j] +
                        #some constants added here
            )
            _deleteB = (
                        matrix[:, i, j-1] +
                        #some constants added here
            )
            matrix[:, i, j] = np.amin(np.stack((_deleteA, _deleteB), axis=1), axis=1)

    return matrix


Questions

Comparing both implementations with %timeit in IPython, I discovered two things about their performance.

  1. The first implementation profits hugely from @jit, while the second does not at all (28 ms vs. 25 s in my test case). Can anybody imagine why @jit no longer works with a numpy array of shape (repetition, n, m)?


Edit

I moved the former second question to an extra post, since asking multiple questions is considered bad style on SO.

The question was:

  1. When neglecting @jit, the first implementation is still a lot faster (same test case: 17 s vs. 26 s). Why is numpy slower when working on three instead of two dimensions?
Markus

1 Answer


I'm not sure what your setup is here, but I re-wrote your example slightly:

import numpy as np
from numba import jit

#@jit(nopython=True)
def foo(matrix):
    n, m = matrix.shape
    for i in range(1, n):
        for j in range(1, m):

            _deleteA = (
                        matrix[i, j] #+
                        #some constants added here
            )
            _deleteB = (
                        matrix[i, j-1] #+
                        #some constants added here
            )
            matrix[i, j] = min(_deleteA, _deleteB)

    return matrix

foo_jit = jit(nopython=True)(foo)

and then timings:

m = np.random.normal(size=(100,50))

%timeit foo(m)  # in a jupyter notebook
# 2.84 ms ± 54.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit foo_jit(m)  # in a jupyter notebook
# 3.18 µs ± 38.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

So here numba is a lot faster as expected. One thing to consider is that global numpy arrays do not behave in numba as you might expect:

https://numba.pydata.org/numba-doc/dev/user/faq.html#numba-doesn-t-seem-to-care-when-i-modify-a-global-variable

It's usually better to pass in the data as I did in the example.

Your issue in the second case is that numba does not support np.amin at this time. See:

https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html

You can see this if you pass nopython=True to jit. In current versions of numba (0.44 as of this writing), it falls back to object mode, which is often no faster than plain Python and sometimes slower, since there is some call overhead.

JoshAdel
  • Thanks Josh for the hint with `np.amin` and that `@jit(nopython=True)` shows this! (+1). Concerning passing the data: you're right, that's important. In my actual implementation (can be found [here](https://github.com/CooleKarotte/timeSeries/blob/master/TWED/memoryEfficient_TWED.py)) I did this. Concerning question no. 2: do you also have a clue why this might be? – Markus Jul 11 '19 at 16:57
  • Here the second [link](https://github.com/CooleKarotte/timeSeries/blob/master/TWED/multiInput_TWED.py) to the second implementation for the sake of completeness. – Markus Jul 11 '19 at 17:01