
I'm computing a "similarity function", which mostly resembles the cosine similarity measure, although I apply a couple of tricks to deal with ordinal values and missing values.

These tricks basically consist of fetching pre-computed values from dynamically generated (although "read-only") lookup tables, which are built from some data distributions.

The problem is that, when I convert my function into a NumPy ufunc with np.frompyfunc, the fact that the function accesses a "global" Python object makes it impossible for NumPy to skip the GIL, and therefore it cannot vectorize/parallelize the execution of that function.
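
A minimal sketch of the setup described above, under stated assumptions: the table `LOOKUP`, the function `weighted_product`, and the sample data are hypothetical stand-ins, since the real similarity function is not shown here. It only illustrates a Python function that reads a module-level table and is then wrapped with `np.frompyfunc`.

```python
import numpy as np

# Hypothetical lookup table, generated at run time and treated as read-only.
rng = np.random.default_rng(0)
LOOKUP = rng.random(5)  # one pre-computed weight per ordinal level 0..4

def weighted_product(a, b):
    # Ordinary Python code that reads the module-level table.
    return LOOKUP[a] * LOOKUP[b]

# Wrap the Python function as a ufunc taking 2 inputs and returning 1 output.
weighted_product_ufunc = np.frompyfunc(weighted_product, 2, 1)

x = np.array([0, 1, 2, 3])
y = np.array([4, 3, 2, 1])
result = weighted_product_ufunc(x, y)

# frompyfunc calls the Python function once per element pair
# and always returns an array of dtype object.
print(result.dtype)
```

The wrapped function still runs as Python code for every element, so the result has to be converted (for example with `result.astype(float)`) before further numeric work.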

Is there any way to give these "ufuncs" "complete ownership" of a pre-calculated lookup table (maybe by passing a copy), so that the GIL can be avoided? Keep in mind that the values are not known at "coding time".

Thank you in advance.

castarco
  • I advise you to use Numba/Cython and rewrite the iteration function so that it does not use globals (by passing all the variables to the jitted functions). Actually, I do not think NumPy can truly vectorize Python function code (i.e. using hardware SIMD instructions) because, AFAIK, it does not have an integrated JIT/compiler, as opposed to Numba/Cython. – Jérôme Richard Aug 08 '21 at 10:52
  • What exactly do you mean by `vectorize/parallelize` here? What's the evidence that there's a GIL problem? – hpaulj Aug 08 '21 at 11:28
  • Why are you using `frompyfunc`? What do you expect from it? How do you know that the GIL and the read-only table are the issue? – hpaulj Aug 08 '21 at 11:42
  • There is no GIL in nopython mode in Numba, nor in the equivalent Cython mode (nogil). Numba experimentally supports native dicts that do not use the GIL. Assuming the GIL is actually the problem, this would fix the issue, as the GIL would not be used anymore. It would be better if you could provide a minimal working example so we can work on a more concrete problem. – Jérôme Richard Aug 08 '21 at 11:43
  • 'lookup table' is not a defined python or numpy class. – hpaulj Aug 08 '21 at 14:36
  • Are you asking this because the performance of the `frompyfunc` `ufunc` is slower than you expect? – hpaulj Aug 08 '21 at 15:21
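
The advice in the comments to pass the table as an argument instead of reading a global can be sketched as follows. This version uses plain NumPy indexing rather than Numba or Cython, and everything in it is an assumption made for illustration: the function name `similarity`, the use of `-1` as a missing-value code, and the sample table are hypothetical, not the asker's actual code.

```python
import numpy as np

def similarity(x, y, table):
    """Cosine-like similarity over ordinal codes, weighted through `table`.

    `table` is passed in explicitly, so the function reads no global state.
    A code of -1 marks a missing value (an assumption for this sketch).
    """
    # Keep only the positions where both vectors have a value.
    valid = (x >= 0) & (y >= 0)
    # Look up all weights at once with integer-array indexing.
    wx = table[x[valid]]
    wy = table[y[valid]]
    denom = np.linalg.norm(wx) * np.linalg.norm(wy)
    return float(wx @ wy / denom) if denom > 0 else 0.0

# Hypothetical table, built at run time in the real use case.
table = np.array([0.1, 0.4, 0.7, 1.0])
x = np.array([0, 1, -1, 3])
y = np.array([0, 1, 2, 3])
print(similarity(x, y, table))
```

Because the table arrives as an ordinary array argument, the same signature should carry over to a jitted function, since Numba's nopython mode accepts NumPy arrays as arguments.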

0 Answers