I'm trying to understand how to integrate C routines into my Python scripts. As a test, I'm adding two NumPy arrays.
I have a C file called test.c:
void add(int count, float* array_a, float* array_b, float* array_c)
{
    int ii;
    for (ii = 0; ii < count; ii++) {
        array_c[ii] = array_a[ii] + array_b[ii];
    }
}
One can compile this into a .so (shared object) with:
gcc -c -fPIC test.c -o test.o
gcc test.o -shared -o test.so
This lets me call the function "add" from Python and execute the C code:
import ctypes

import numpy as np
from numpy.ctypeslib import ndpointer

size = 100
a = np.ones(size, dtype=np.float32)
b = np.ones(size, dtype=np.float32)
c = np.zeros(size, dtype=np.float32)

lib = ctypes.cdll.LoadLibrary('./test.so')
fun = lib.add  # the C function
fun.restype = None
fun.argtypes = [ctypes.c_int,
                ndpointer(ctypes.c_float),
                ndpointer(ctypes.c_float),
                ndpointer(ctypes.c_float)]

# Pass the C function pointers to the memory of the arrays a, b and c.
%timeit fun(size, a, b, c)
%timeit c = a + b
The C function takes 11 µs per call, while the NumPy addition takes 442 ns. Where does this difference in timing come from? Where is the hidden cost?
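To narrow this down, here is a minimal sketch that times the argument conversion on its own. It assumes (this is a guess, not something I have confirmed) that the extra time is per-call overhead on the Python side rather than time spent in the C loop. ctypes calls from_param on each entry of argtypes for every call, so the snippet times that step for a single ndpointer argument and compares it with the NumPy addition at the same size. The names float_ptr, n, t_convert and t_numpy are only for illustration, and the snippet does not need test.so.

```python
import ctypes
import timeit

import numpy as np
from numpy.ctypeslib import ndpointer

size = 100
a = np.ones(size, dtype=np.float32)
b = np.ones(size, dtype=np.float32)

# Same argument type as in fun.argtypes above (illustrative name).
float_ptr = ndpointer(ctypes.c_float)

n = 10000  # number of repetitions per measurement

# Time the check/conversion that ctypes runs for ONE ndpointer argument per call.
t_convert = timeit.timeit(lambda: float_ptr.from_param(a), number=n) / n

# Time the NumPy addition at the same array size, for comparison.
t_numpy = timeit.timeit(lambda: a + b, number=n) / n

print(f"ndpointer.from_param: {t_convert * 1e6:.2f} us per argument")
print(f"a + b:                {t_numpy * 1e6:.2f} us per call")
```

If the conversion time multiplied by three (one per array argument) accounts for a large share of the 11 µs, that would point at the argument checking rather than the C code. Repeating both timings with a much larger size would show whether the gap stays constant or grows with the array length.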