I have this code:
output_array = np.vectorize(f, otypes='d')(input_array)
And I'd like to replace it with this code, which is supposed to give the same output:
output_array = np.ndarray(input_array.shape, dtype='d')
for i, item in enumerate(input_array):
    output_array[i] = f(item)
The reason I want the second version is that I can then start iterating on output_array in a separate thread while it's being calculated. (Yes, I know about the GIL; that part is taken care of.)
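To make the intended usage concrete, here is a minimal sketch of the producer/consumer pattern described above. It is only an illustration under assumptions: f is a hypothetical stand-in, and the progress counter, cond, fill and consume names are invented for this example rather than taken from the real code.

```python
import threading

import numpy as np


def f(x):
    # Hypothetical stand-in for the real per-element function.
    return x * 2.0


def fill(input_array, output_array, progress, cond):
    # Producer: write one element at a time and publish how many are ready.
    for i, item in enumerate(input_array):
        output_array[i] = f(item)
        with cond:
            progress[0] = i + 1
            cond.notify_all()


def consume(output_array, progress, cond, seen):
    # Consumer: read element i only after the producer has published it.
    for i in range(len(output_array)):
        with cond:
            cond.wait_for(lambda: progress[0] > i)
        seen.append(float(output_array[i]))


input_array = np.arange(5, dtype='d')
output_array = np.empty(input_array.shape, dtype='d')
progress = [0]  # number of elements written so far (assumed bookkeeping)
cond = threading.Condition()
seen = []

consumer = threading.Thread(
    target=consume, args=(output_array, progress, cond, seen)
)
consumer.start()
fill(input_array, output_array, progress, cond)
consumer.join()
```

The point of the sketch is that the consumer needs per-element access to output_array as it fills, which a single np.vectorize call does not offer.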
Unfortunately, the for loop is very slow, even when I'm not processing the data on a separate thread. I benchmarked it on both CPython and PyPy3, which is my target platform. On CPython it's 3 times slower than vectorize, and on PyPy3 it's 67 times slower than vectorize!
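For anyone who wants to reproduce the comparison, a timing sketch along these lines should do. The f below is a hypothetical stand-in, and the array size and repeat count are arbitrary assumptions, so the ratio it prints will not necessarily match the figures quoted above.

```python
import timeit

import numpy as np


def f(x):
    # Hypothetical stand-in; the real f is not shown in the question.
    return x * x + 1.0


def with_vectorize(input_array):
    return np.vectorize(f, otypes='d')(input_array)


def with_loop(input_array):
    # np.empty allocates uninitialized storage of the requested shape.
    output_array = np.empty(input_array.shape, dtype='d')
    for i, item in enumerate(input_array):
        output_array[i] = f(item)
    return output_array


input_array = np.arange(10_000, dtype='d')  # assumed size, 1-D input

# Both versions should agree before their timings are compared.
assert np.array_equal(with_vectorize(input_array), with_loop(input_array))

t_vectorize = timeit.timeit(lambda: with_vectorize(input_array), number=20)
t_loop = timeit.timeit(lambda: with_loop(input_array), number=20)
print(f"vectorize: {t_vectorize:.3f}s  loop: {t_loop:.3f}s  "
      f"ratio: {t_loop / t_vectorize:.1f}x")
```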
That's despite the fact that the NumPy documentation says "The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop."
Any idea why my implementation is slow, and how to make a fast implementation that still allows me to use output_array before it's finished?