Numpy: vectorizing a function that integrate 2D array

Question

I need to perform this following integration for a 2D array: That is, each point in the grid get the value RC, which is integration over 2D of the difference between the whole field and the value of the field U at certain point (x,y), multiplying the normalized kernel, that in 1D version is:

What I did so far is an inefficient iteration over indexes:

def normalized_bimodal_kernel_2D(x,y,a,x0=0.0,y0=0.0):
    """ Gives a kernel that is zero in x=0, and its integral from -infty to 
    +infty is 1.0. The parameter a is a length scale where the peaks of the 
    function are."""
    dist = (x-x0)**2 + (y-y0)**2
    return (dist*np.exp(-(dist/a)))/(np.pi*a**2)


def RC_2D(U,a,dx):
    nx,ny=U.shape
    x,y = np.meshgrid(np.arange(0,nx, dx),np.arange(0,ny,dx), sparse=True)
    UB = np.zeros_like(U)
    for i in xrange(0,nx):
        for j in xrange(0,ny):
            field=(U-U[i,j])*normalized_bimodal_kernel_2D(x,y,a,x0=i*dx,y0=j*dx)
            UB[i,j]=np.sum(field)*dx**2
    return UB

def centerlizing_2D(U,a,dx):
    nx,ny=U.shape
    x,y = np.meshgrid(np.arange(0,nx, dx),np.arange(0,ny,dx), sparse=True)
    UB = np.zeros((nx,ny,nx,ny))
    for i in xrange(0,nx):
        for j in xrange(0,ny):
            UB[i,j]=normalized_bimodal_kernel_2D(x,y,a,x0=i*dx,y0=j*dx)
    return UB

You can see the result of the centeralizing function here:

U=np.eye(20)
plt.imshow(centerlizing(U,10,1)[10,10])

I'm sure I have additional bugs, so any feedback will be warmly welcome, but what I am really interested is understanding how I can do this operation much faster in vectorized way.

I think this question will get better answers at https://codereview.stackexchange.com — Ofer Sadan, Aug 10 '17 at 08:10
Nah, more `numpy` folks hang out here. And this is a pure `numpy` vectorization problem. — Daniel F, Aug 10 '17 at 08:12
Biggest efficiency improvement off the top is that you're creating and running your calculations on a `(nx/dx, ny/dx)` `meshgrid` but only summing over a `(nx, ny)` slice of it. Or maybe that's a bug, hard to tell. Is the output what you expect? — Daniel F, Aug 10 '17 at 08:17

MB-F · Accepted Answer · 2017-08-10T09:12:14.317

normalized_bimodal_kernel_2D is called in the two nested loops that each shift only it's offset by small steps. This duplicates many computations.

An optimization for centerlizing_2D is to compute the kernel once for a bigger range and then define UB to take shifted views into that. This is possible using stride_tricks, which unfortunately is rather advanced numpy.

def centerlizing_2D_opt(U,a,dx):
    nx,ny=U.shape    
    x,y = np.meshgrid(np.arange(-nx//2, nx+nx//2, dx),
                      np.arange(-nx//2, ny+ny//2, dx),  # note the increased range
                      sparse=True)
    k = normalized_bimodal_kernel_2D(x, y, a, x0=nx//2, y0=ny//2)
    sx, sy = k.strides    
    UB = as_strided(k, shape=(nx, ny, nx*2, ny*2), strides=(sy, sx, sx, sy))
    return UB[:, :, nx:0:-1, ny:0:-1]

assert np.allclose(centerlizing_2D(U,10,1), centerlizing_2D_opt(U,10,1)) # verify it's correct

Yep, it's faster:

%timeit centerlizing_2D(U,10,1)      #   100 loops, best of 3:  9.88 ms per loop
%timeit centerlizing_2D_opt(U,10,1)  # 10000 loops, best of 3: 85.9  µs per loop

Next, we optimize RC_2D by expressing it with the optimized centerlizing_2D routine:

def RC_2D_opt(U,a,dx):
    UB_tmp = centerlizing_2D_opt(U, a, dx)
    U_tmp = U[:, :, None, None] - U[None, None, :, :]
    UB = np.sum(U_tmp * UB_tmp, axis=(0, 1))
    return UB

assert np.allclose(RC_2D(U,10,1), RC_2D_opt(U,10,1))

Performance of %timeit RC_2D(U,10, 1):

#original:    100 loops, best of 3: 13.8 ms per loop
#@DanielF's:  100 loops, best of 3:  6.98 ms per loop
#mine:       1000 loops, best of 3:  1.83 ms per loop

score 2 · Answer 2 · answered Aug 10 '17 at 08:51

Assuming dx=1 since I'm not sure what you're trying to do with that discretization:

def normalized_bimodal_kernel_2D(x, y, a):  
    #generating a 4-d tensor instead of 1d vector
    dist = (x[:,None,None,None] - x[None,None,:,None])**2 +\
           (y[None,:,None,None] - y[None,None,None,:])**2    
    return (dist * np.exp(-(dist / a))) / (np.pi * a**2)

def RC_2D(U, a):
    nx, ny = U.shape
    x, y = np.arange(nx), np.arange(ny)
    U4 = U[:, :, None, None] - U[None, None, :, :] #Another 4d
    k = normalized_bimodal_kernel_2D(x, y, a)
    return np.einsum('ijkl,ijkl->ij', U4, k)


def centerlizing_2D(U, a):
    nx, ny = U.shape
    x, y = np.arange(nx), np.arange(ny)
    return normalized_bimodal_kernel_2D(x, y, a)

Basically, vecotrizing for loops in numpy is a matter of adding more dimensions. You were doing two loops over a 2D U vector, so to vectorize just turn it into 4D.

B. M. · Answer 3 · 2017-08-11T12:58:19.363

1

To fit your formula, let U be a function.

Then you just have to put x,y,x',y' in four different dimensions with np.ix_ and carrefully translate your formula. Numpy broadcasting will do the rest.

a=20
x,y,xp,yp=np.ix_(*[np.linspace(0,1,a)]*4)

def U(x,y) : return np.float32(x == y)  # function "eye"

def f(x,y,xp,yp,a):
    r2=(x-xp)**2+(y-yp)**2
    return r2*np.exp(-r2/a)*(U(xp,yp) - U(x,y))/np.pi/a/a

#f(x,y,xp,yp,a).shape is (20, 20, 20, 20)

RC=f(x,y,xp,yp,a).sum(axis=(2,3))
#RC.shape is (20, 20)

edited Aug 11 '17 at 12:58

answered Aug 11 '17 at 11:43

B. M.

18,243
2
35
54

Very interesting solution - but how can I make it work if the input I have is a given 2D array. In my case I am performing time integration of a variable that is a field in 2D, so in each iteration U is different. – Ohm Aug 11 '17 at 15:33

Numpy: vectorizing a function that integrate 2D array

3 Answers3