
I'm trying to use numba to speed up a slow calculation. It works great with the @njit decorator, but I really need it to work as a precompiled ahead-of-time (AOT) module. Sadly, I haven't been able to get it to work. Here is the code I use to compile the AOT module:

from numba.pycc import CC
import numpy as np

cc = CC('window_cloud_scores')
cc.verbose = True
cc.output_dir='/cache'
cc.output_file='window_cloud_scores.so'


@cc.export('run', 'f8[:,:](u1[:,:], i4)')
def run(clouds,window):
    r=int(window/2)
    assert clouds.ndim==2
    assert clouds.shape[0]==clouds.shape[1]
    rows,cols=clouds.shape
    score_map=np.full(clouds.shape,-1)
    scores=[]
    for j in range(r,rows-r):
        score_cols=[]
        for i in range(r,cols-r):
            clouds_window=clouds[j-r:j+r+1,i-r:i+r+1]
            score_cols.append(clouds_window.mean())
        scores.append(score_cols)
    return np.array(scores)


if __name__ == "__main__":
    cc.compile()

When I compile the module, it creates the window_cloud_scores.so file but emits the following warning:

/Users/.../lib/python3.6/site-packages/numba/pycc/../runtime/_nrt_python.c:234:55: warning: incompatible pointer types passing 'PyTypeObject *' (aka 'struct _typeobject *') to parameter of type 'PyObject *' (aka 'struct _object *') [-Wincompatible-pointer-types]
    mi = (MemInfoObject*)PyObject_CallFunctionObjArgs(&MemInfoType, addr, NULL);
                                                      ^~~~~~~~~~~~
/Users/.../python3.6m/abstract.h:425:68: note: passing argument to parameter 'callable' here
PyAPI_FUNC(PyObject *) PyObject_CallFunctionObjArgs(PyObject *callable,

And then when I try to run

import window_cloud_scores as wcs
wcs.run(...)

I get a "Segmentation fault: 11" in the Python console, and in a Jupyter notebook the kernel dies.
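An AOT export is compiled for exactly one signature, here 'f8[:,:](u1[:,:], i4)': a 2-D uint8 array and a 32-bit integer. The sketch below checks inputs against that signature before the compiled function is called. The helper name `check_run_args` is hypothetical (it is not part of numba), and whether mismatched inputs have anything to do with the segmentation fault is not established here.

```python
import numpy as np


def check_run_args(clouds, window):
    """Validate inputs against the exported signature 'f8[:,:](u1[:,:], i4)'.

    Hypothetical helper, not part of numba. It raises instead of silently
    casting, so a wrong dtype is reported before the compiled code runs.
    """
    if not isinstance(clouds, np.ndarray):
        raise TypeError("clouds must be a NumPy array")
    if clouds.dtype != np.uint8:  # u1 in the signature means uint8
        raise TypeError(f"clouds must have dtype uint8, got {clouds.dtype}")
    if clouds.ndim != 2:  # [:,:] in the signature means two dimensions
        raise ValueError(f"clouds must be 2-D, got {clouds.ndim} dimension(s)")
    window = int(window)
    if not -2**31 <= window < 2**31:  # i4 in the signature means int32
        raise ValueError("window does not fit in a 32-bit integer")
    return clouds, window
```

With the compiled module imported as `wcs`, the call would then look like `wcs.run(*check_run_args(clouds, window))`.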

By contrast, the same function with @njit:

from numba import njit
import numpy as np

@njit
def run(clouds,window):
    r=int(window/2)
    assert clouds.ndim==2
    assert clouds.shape[0]==clouds.shape[1]
    rows,cols=clouds.shape
    score_map=np.full(clouds.shape,-1)
    scores=[]
    for j in range(r,rows-r):
        score_cols=[]
        for i in range(r,cols-r):
            clouds_window=clouds[j-r:j+r+1,i-r:i+r+1]
            score_cols.append(clouds_window.mean())
        scores.append(score_cols)
    return np.array(scores)

Works great. Thoughts?

brook
  • Works for me. What's your numba version? – MSeifert May 28 '19 at 19:08
  • Btw: The way you're using numba isn't really utilizing the power of numba. I wouldn't expect "really significant" speed-ups with this function except when you have really small windows. – MSeifert May 28 '19 at 19:09
  • @MSeifert - Thanks for jumping on this. I'm using `numba==0.43.1`; it's in a conda environment (`Python 3.6.7`) if that makes a difference. – brook May 28 '19 at 20:23
  • @MSeifert - As for speed-ups: my window size is 17. I would figure most of the speed-up comes from the nested for loops, i.e. `clouds_window.mean()` is pretty fast for almost any window size; the issue is that it has to be computed for every pixel. All that said, I'm new to numba and not sure where its strengths are. For my test case, it's about 2 seconds without numba, 700ms for the first numba run and 100ms for subsequent runs. – brook May 28 '19 at 20:29
  • Hm, have you tried [scipy convolve](https://docs.scipy.org/doc/scipy-0.16.0/reference/generated/scipy.ndimage.filters.convolve.html#scipy.ndimage.filters.convolve)? It's not exactly the same but very similar. – MSeifert May 28 '19 at 20:44
  • I hadn't seen scipy convolve before. Not quite what I need, but it's good to know it's there. Thanks! – brook May 29 '19 at 11:42
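The comments note that the cost comes from recomputing a window mean at every pixel. A minimal NumPy-only sketch of one way to avoid the nested loops is a summed-area table (cumulative sums along both axes), which yields every full-window sum with four array lookups. This is a swapped-in technique, not the code from the question and not the scipy convolve approach; the function name `window_mean_scores` is hypothetical. It assumes the same output as `run`: one mean per position where the full window fits.

```python
import numpy as np


def window_mean_scores(clouds, window):
    """Mean of every full (2r+1) x (2r+1) window, with r = window // 2.

    Hypothetical name; intended to match the output of `run` in the question.
    """
    r = window // 2
    k = 2 * r + 1  # actual window edge length used by the original loops
    rows, cols = clouds.shape

    # Summed-area table padded with a leading row and column of zeros,
    # so sat[a, b] holds the sum of clouds[:a, :b].
    sat = np.zeros((rows + 1, cols + 1), dtype=np.float64)
    sat[1:, 1:] = clouds.astype(np.float64).cumsum(axis=0).cumsum(axis=1)

    # Sum over clouds[a:a+k, b:b+k] by inclusion-exclusion on the table.
    totals = sat[k:, k:] - sat[:-k, k:] - sat[k:, :-k] + sat[:-k, :-k]
    return totals / (k * k)
```

The result has shape `(rows - 2*r, cols - 2*r)`, the same as the nested-loop version, and the work no longer grows with the window size.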

0 Answers