I have been testing the following block for numba speed up:
import numpy as np
import timeit
from numba import njit
import numba
@numba.guvectorize(["void(float64[:],float64[:],float64[:],float64, float64, float64[:])"],
"(m),(m),(m),(),()->(m)",nopython=True,target="parallel")
def func_diff_calc_numba_v2(X,refY,Y,lower,upper,arr):
fac=1000
for i in range(len(X)):
if X[i] >=lower and X[i] <upper:
diff=Y[i]-refY[i]
arr[i] = diff**2*fac
else:
arr[i] = 0
@numba.vectorize('(float64, float64, float64, float64, float64)',nopython=True,target="parallel")
def func_diff_calc_numba_v3(X,refY,Y,lower,upper):
fac=1000
if X >= lower and X < upper:
return (Y-refY)**2*fac
else:
return 0.0
@njit
def func_diff_calc_numba(X,refY,Y,lower,upper):
fac=1000
arr=np.zeros(len(X))
for i in range(len(X)):
if X[i] >=lower and X[i] <upper:
arr[i]=(Y[i]-refY[i])**2*fac
else:
arr[i] = 0
return arr
np.random.seed(69)
X=np.arange(10000)
refY = np.random.rand(10000)
Y = np.random.rand(10000)
lower=1
upper=10000
print("func_diff_calc_numba: {:.5f}".format(timeit.timeit(stmt="func_diff_calc_numba(X,refY,Y,lower,upper)", number=10000, globals=globals())))
print("func_diff_calc_numba_v2: {:.5f}".format(timeit.timeit(stmt="func_diff_calc_numba_v2(X,refY,Y,lower,upper)", number=10000, globals=globals())))
print("func_diff_calc_numba_v3: {:.5f}".format(timeit.timeit(stmt="func_diff_calc_numba_v3(X,refY,Y,lower,upper)", number=10000, globals=globals())))
The speedups for the v2 and v3 are significantly different:
func_diff_calc_numba: 0.58257
func_diff_calc_numba_v2: 0.49573
func_diff_calc_numba_v3: 1.07519
and if I change the number of iterations from 10,000 to 100,000 then:
func_diff_calc_numba: 1.67251
func_diff_calc_numba_v2: 4.85828
func_diff_calc_numba_v3: 11.63361
I was expecting vectorize and guvectorize to be almost similar in speedup but while njit and guvectorize are almost equal to each other in time, vectorize is ~2 and ~10 times slower than guvectorize and njit respectively. Is there is something wrong in my implementation or something else?