2

I'm trying to write a section of code that computes the curl of a vector field numerically to second order with periodic boundary conditions. However, the algorithm I made is very slow and I'm wondering if anyone knows of any alternative algorithms.

To give more specific context: I'm using a 3xAxBxC numpy array as my vector field where the first axis refers to the Cartesian direction (x,y,z) and A,B,C refer to the number of bins in that Cartesian direction (i.e the resolution). So for example, I might have a vector field F = np.zeros((3,64,64,64)) where Fx = F[0] is a 64x64x64 Cartesian lattice in its own right. So far, my solution was to use the 3-point centered difference stencil to calculate the derivatives and used a nested loop to iterate over all the different dimensions using modular arithmetic to enforce the periodic boundary conditions (see below for example). However, as my resolution increases (the size of A,B,C) this begins to take a long time (upwards 2 minutes, which adds up if I do this several hundred times for my simulation - this is just one small part of a larger algorithm). I was wondering if anyone know of an alternative method for doing this?

import numpy as np

F =np.array([np.ones([128,128,128]),2*np.ones([128,128,128]),
               3*np.ones([128,128,128])])


VxF =np.array([np.zeros([128,128,128]),np.zeros([128,128,128]),
               np.zeros([128,128,128])])

for i in range(0,128):
     for j in range(0,128):
          for k in range(0,128):
           VxF[0][i,j,k] = 0.5*((F[2][i,(j+1)%128,k]-
                    F[2][i,j-1,k])-(F[1][i,j,(k+1)%128]-F[1][i,j,k-1]))
           VxF[1][i,j,k] = 0.5*((F[0][i,j,(k+1)%128]-
                    F[0][i,j,k-1])-(F[2][(i+1)%128,j,k]-F[2][i-1,j,k]))
           VxF[2][i,j,k] = 0.5*((F[1][(i+1)%128,j,k]-
                    F[1][i-1,j,k])-(F[0][i,(j+1)%128,k]-F[0][i,j-1,k]))

Just to re-iterate, I'm looking for an algorithm that'll compute the curl of a vector field array to second order given periodic boundary conditions faster than the one I have. Maybe there's nothing that will do this, but I just want to check before I keep spending time running this algorithm. Thank. you everyone in advance!

1 Answers1

6

There may be better tools for this, but here is a trivial 200x speedup with numba:

import numpy as np
from numba import jit

def pure_python():
    F =np.array([np.ones([128,128,128]),2*np.ones([128,128,128]),
                3*np.ones([128,128,128])])


    VxF =np.array([np.zeros([128,128,128]),np.zeros([128,128,128]),
                np.zeros([128,128,128])])

    for i in range(0,128):
        for j in range(0,128):
            for k in range(0,128):
                VxF[0][i,j,k] = 0.5*((F[2][i,(j+1)%128,k]-
                            F[2][i,j-1,k])-(F[1][i,j,(k+1)%128]-F[1][i,j,k-1]))
                VxF[1][i,j,k] = 0.5*((F[0][i,j,(k+1)%128]-
                            F[0][i,j,k-1])-(F[2][(i+1)%128,j,k]-F[2][i-1,j,k]))
                VxF[2][i,j,k] = 0.5*((F[1][(i+1)%128,j,k]-
                            F[1][i-1,j,k])-(F[0][i,(j+1)%128,k]-F[0][i,j-1,k]))

    return VxF

@jit(fastmath=True)
def with_numba():
    F =np.array([np.ones([128,128,128]),2*np.ones([128,128,128]),
                3*np.ones([128,128,128])])


    VxF =np.array([np.zeros([128,128,128]),np.zeros([128,128,128]),
                np.zeros([128,128,128])])

    for i in range(0,128):
        for j in range(0,128):
            for k in range(0,128):
                VxF[0][i,j,k] = 0.5*((F[2][i,(j+1)%128,k]-
                            F[2][i,j-1,k])-(F[1][i,j,(k+1)%128]-F[1][i,j,k-1]))
                VxF[1][i,j,k] = 0.5*((F[0][i,j,(k+1)%128]-
                            F[0][i,j,k-1])-(F[2][(i+1)%128,j,k]-F[2][i-1,j,k]))
                VxF[2][i,j,k] = 0.5*((F[1][(i+1)%128,j,k]-
                            F[1][i-1,j,k])-(F[0][i,(j+1)%128,k]-F[0][i,j-1,k]))

    return VxF

The pure Python version takes 13 seconds on my machine, while the numba version takes 65 ms.

hilberts_drinking_problem
  • 11,322
  • 3
  • 22
  • 51
  • 1
    WOW! This is pretty amazing. Thank you very much! When I started work this morning I ran my full algorithm and it took ~ 1 hour. I just added your change and it ran in ~2 minutes. I'm going to leave the question open for a bit longer to see if there are any other algorithms I can learn for my own knowledge, but this certainly solves my problem. Again, thank you so much! – P. Reinecke Aug 05 '19 at 20:16
  • No worries, `numba` is a great tool and is very useful when you have a bunch of nested loops with simple math inside. – hilberts_drinking_problem Aug 05 '19 at 20:17
  • 3
    Just be aware that the `fastmath` option breaks IEEE 754 compliance, and doesn't check for `Inf` and `NaN`. This might be fine for you, but it seems a little dangerous when computing derivatives. – Joey Dumont Aug 07 '19 at 18:16