How to force a function to broadcast without invoking `np.vectorize`

Question

I am looking for a way to force a function to broadcast.

There are scenarios in which the function or method may later be overridden by a constant function. In that case, given

arr = np.arange(0, 1, 0.0001)
f = lambda x: 5
f(arr) # this gives just the integer 5; I want [5, 5, ..., 5]

I am aware of methods like np.vectorize, which force a function to broadcast, but the problem is that it is inefficient: it is essentially a for loop under the hood (see the documentation).

We can also use factory functions like np.frompyfunc, which turns a Python function into a NumPy universal function (ufunc); see here for instance. This outperformed np.vectorize, but it is still far less efficient than built-in ufuncs.
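As a minimal sketch of the two wrappers mentioned above (the small array is chosen only for illustration), note that np.frompyfunc returns an object-dtype array, so a cast is needed if a numeric result is wanted:

```python
import numpy as np

arr = np.arange(0, 1, 0.25)   # small illustrative array, 4 elements
f = lambda x: 5               # constant function from the question

# Both wrappers call f once per element, so both return an array of arr's shape.
via_vectorize = np.vectorize(f)(arr)
via_frompyfunc = np.frompyfunc(f, 1, 1)(arr)   # 1 input, 1 output

# frompyfunc always produces object dtype; cast to get a numeric array.
as_int = via_frompyfunc.astype(int)
```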

I was wondering if there is any efficient numpy way of handling this, namely to force the function to broadcast?

user2357112 · Accepted Answer · 2019-05-15T23:45:31.950


If there was a better way to make arbitrary Python functions broadcast, numpy.vectorize would use it. You really have to write the function with broadcasting in mind if you want it to broadcast efficiently.

In the particular case of a constant function, you can write a broadcasting constant function using numpy.full:

def f(x):
    return numpy.full(numpy.shape(x), 5)
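A quick usage sketch of the function above, assuming only that NumPy is installed. numpy.shape accepts scalars and nested lists as well as arrays, so the same definition covers all of them:

```python
import numpy

def f(x):
    # numpy.shape works on scalars, lists and arrays alike
    return numpy.full(numpy.shape(x), 5)

scalar_result = f(0.3)                    # 0-d array, since shape(0.3) is ()
array_result = f(numpy.zeros((2, 3)))     # shape (2, 3), every entry 5
```

The result dtype comes from the fill value (an integer here), not from the input.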

numba.vectorize can also vectorize functions more effectively than numpy.vectorize, but you need Numba, and you need to write your function in a way that Numba can compile efficiently.

edited May 15 '19 at 23:45

answered May 15 '19 at 22:52


Thanks, I am aware of `numba` as well. Nonetheless, I was hoping to find something more generic, like a function wrapper that I can simply apply as a decorator to those functions. – Wunderbar May 15 '19 at 23:11
And a kind reminder: `np.full(shape, val)` takes the shape first. – Wunderbar May 15 '19 at 23:38
Whoops, that was indeed the wrong argument order. Fixed. – user2357112 May 15 '19 at 23:45

Wunderbar · Answer 2 · 2019-05-16T18:49:22.713

For those who can live without a generic answer, the best option is np.full_like(arr, val), which is about 20% faster than np.full(arr.shape, val).

After raising this issue with the author, I found a middle ground that is both general and performs rather well:

np.broadcast_arrays(x, f(x))[1]

Here is some timing analysis:

arr = np.arange(1, 2, 0.0001).reshape(10, -1)

def master_f(x): return np.broadcast_arrays(x, f(x))[-1].copy('K')
def master_f_nocopy(x): return np.broadcast_arrays(x, f(x))[-1]
def vector_f(x): return np.vectorize(f)(x)

%timeit arr+1 # this takes about 10 microsec
%timeit master_f(arr) # this takes about 40 microsec
%timeit master_f_nocopy(arr) # this takes about 20 microsec

Note that this also works for projection functions such as f(x, y) := y, which np.full_like cannot handle.

Moreover, for more complicated functions like np.sin and np.cos, the difference between f(arr) and master_f_nocopy(arr) is almost negligible.
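The np.broadcast_arrays trick can also be packaged as the kind of decorator asked for in the comments on the accepted answer. The following is only a sketch: the names force_broadcast, const and second are hypothetical, and it assumes positional arguments only.

```python
import functools
import numpy as np

def force_broadcast(func):
    """Hypothetical decorator: broadcast func's result against its arguments."""
    @functools.wraps(func)
    def wrapper(*args):
        result = func(*args)
        # The last entry is the result broadcast to the common shape of
        # all inputs; .copy() returns a writable array that owns its data.
        return np.broadcast_arrays(*args, result)[-1].copy()
    return wrapper

@force_broadcast
def const(x):
    return 5

@force_broadcast
def second(x, y):          # projection f(x, y) := y
    return y

arr = np.arange(6).reshape(2, 3)
const_out = const(arr)                            # shape (2, 3), all 5
proj_out = second(arr, np.array([10, 20, 30]))    # y repeated along axis 0
```

Dropping the .copy() call saves time, as in master_f_nocopy, but then the result is a broadcast view rather than an independent array.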

`full_like(arr, val)` takes the dtype from `arr` instead of `val`, though. — user2357112, May 15 '19 at 23:47
The `broadcast_arrays` approach creates a read-only view of `numpy.asarray(f(x))` where every cell of the view aliases the same cell of the underlying array. This is only okay if you don't need to modify either the result array or the underlying array. (The one where you stuck a `copy` call on the end doesn't have this problem.) — user2357112, May 16 '19 at 19:00
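Both caveats from the comments can be checked with a short sketch. The variable names are illustrative, and base stands in for numpy.asarray(f(x)):

```python
import numpy as np

x = np.zeros((2, 3))
base = np.asarray(5)                       # stands in for asarray(f(x))
view = np.broadcast_arrays(x, base)[1]     # broadcast view, no data copied
safe = view.copy()                         # independent array with its own data

# Zero strides: every cell of the view maps to the same memory cell.
strides = view.strides

base[...] = 7                              # mutate the underlying array
# The view now reads 7 everywhere, while the earlier copy still holds 5.

# full_like takes its dtype from the first argument, so a float fill value
# is cast to int here; np.full takes the dtype from the fill value instead.
truncated = np.full_like(np.arange(3), 0.5)
kept = np.full(np.arange(3).shape, 0.5)
```

If the fill value's dtype should win with full_like, passing dtype explicitly (for example dtype=float) avoids the cast.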
