
I have the following function:

def fun1(a):
    b=a[0]+a[1]
    return(b)

I want to vectorize it using:

fun2 = np.vectorize(fun1,signature='(n,m)->(n,1)')

The input is:

input=np.array([
[1 ,2],
[3, 4],
[5, 6]])

I need the output to look like:

fun2(input)=
np.array([[3],[7],[11]])

I know I can do this with np.sum(input,axis=1), but I am trying to understand the signature. Can you help me with it?

Edit: I know the function is simple enough that it does not need vectorization, but if I cannot vectorize a simple function, I won't be able to vectorize a complex one.


Jamiu S.
  • Did you read the docs? What are you confused about specifically? https://numpy.org/doc/stable/reference/c-api/generalized-ufuncs.html – Pranav Hosangadi Dec 06 '22 at 14:24
  • The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop. You'd be better off using `np.sum` – Pranav Hosangadi Dec 06 '22 at 14:26
  • your function does `b=a[0]+a[1]`, but then you `return(a)`. you are returning the same parameter you passed and ignoring the calculation. Do you mean `return b`? – Sembei Norimaki Dec 06 '22 at 14:26
  • With `signature` it's even slower. `vectorize` is somewhat useful when applied to a function with several inputs, and you want to take advantage of broadcasting. With just one argument it's an unnecessary layer. – hpaulj Dec 06 '22 at 15:24
  • My long answer from a few weeks ago: https://stackoverflow.com/questions/74589308/trying-to-understand-signature-in-numpy-vectorize#74591997 – hpaulj Dec 06 '22 at 15:44
  • In another recent SO I claimed it would be better to 'vectorize' a cover like `lambda a1,a2: fun1((a1,a2))`, which needs no signature. – hpaulj Dec 06 '22 at 15:49
  • What's wrong with the pure python `[fun1(a) for a in input]`? – hpaulj Dec 06 '22 at 17:37

1 Answer

In [187]: def fun1(a):
     ...:     b=a[0]+a[1]
     ...:     return(b)
     ...: 
     ...: fun2 = np.vectorize(fun1,signature='(n)->()')
     ...: 
     ...: input=np.array([
     ...: [1 ,2],
     ...: [3, 4],
     ...: [5, 6]])

In [188]: fun2(input)
Out[188]: array([ 3,  7, 11])

b is a scalar, so we can't force the output to have shape (3,1). You'll have to settle for (3,).
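If the (3,1) shape is still wanted, one option is to add the trailing axis after the vectorized call. The following is only a minimal sketch: the names data, flat and column are illustrative (data replaces input so the builtin is not shadowed).

```python
import numpy as np

def fun1(a):
    # a is one row of the input, shape (2,)
    return a[0] + a[1]

# '(n)->()' : each call consumes a 1d row and produces a scalar
fun2 = np.vectorize(fun1, signature='(n)->()')

data = np.array([[1, 2],
                 [3, 4],
                 [5, 6]])

flat = fun2(data)        # shape (3,)
column = flat[:, None]   # add a trailing axis, shape (3, 1)
```

The signature still describes what fun1 really returns; the reshape is a separate step outside vectorize.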

With your attempt the full traceback (which YOU should have posted) is:

In [191]: fun2 = np.vectorize(fun1,signature='(n,m)->(n,1)')

In [192]: fun2(input)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [192], in <cell line: 1>()
----> 1 fun2(input)

File ~\anaconda3\lib\site-packages\numpy\lib\function_base.py:2163, in vectorize.__call__(self, *args, **kwargs)
   2160     vargs = [args[_i] for _i in inds]
   2161     vargs.extend([kwargs[_n] for _n in names])
-> 2163 return self._vectorize_call(func=func, args=vargs)

File ~\anaconda3\lib\site-packages\numpy\lib\function_base.py:2237, in vectorize._vectorize_call(self, func, args)
   2235 """Vectorized call to `func` over positional `args`."""
   2236 if self.signature is not None:
-> 2237     res = self._vectorize_call_with_signature(func, args)
   2238 elif not args:
   2239     res = func()

File ~\anaconda3\lib\site-packages\numpy\lib\function_base.py:2291, in vectorize._vectorize_call_with_signature(self, func, args)
   2289 if outputs is None:
   2290     for result, core_dims in zip(results, output_core_dims):
-> 2291         _update_dim_sizes(dim_sizes, result, core_dims)
   2293     if otypes is None:
   2294         otypes = [asarray(result).dtype for result in results]

File ~\anaconda3\lib\site-packages\numpy\lib\function_base.py:1892, in _update_dim_sizes(dim_sizes, arg, core_dims)
   1890 num_core_dims = len(core_dims)
   1891 if arg.ndim < num_core_dims:
-> 1892     raise ValueError(
   1893         '%d-dimensional argument does not have enough '
   1894         'dimensions for all core dimensions %r'
   1895         % (arg.ndim, core_dims))
   1897 core_shape = arg.shape[-num_core_dims:]
   1898 for dim, size in zip(core_dims, core_shape):

ValueError: 1-dimensional argument does not have enough dimensions for all core dimensions ('n', '1')

vectorize can't force b to be 2d. What we can do is:

In [193]: fun2 = np.vectorize(fun1,signature='(n,m)->(1)')

In [194]: fun2(input)
Out[194]: array([4, 6])

If I add a print(a) to fun1, we see that it just passes the whole 2d array to fun1, so the effect is simply to add the first 2 rows:

In [196]: fun2(input)
[[1 2]
 [3 4]
 [5 6]]
Out[196]: array([4, 6])

In [197]: input[0]+input[1]
Out[197]: array([4, 6])
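A 2d core signature only starts looping once the input has more than two dimensions. As a rough illustration (the 3d array stack and the helper recording_fun1 are made up for this sketch, not part of the question), the function below records the shape of every argument it is given:

```python
import numpy as np

def fun1(a):
    return a[0] + a[1]

seen_shapes = []

def recording_fun1(a):
    # remember what vectorize actually hands to the function
    seen_shapes.append(a.shape)
    return fun1(a)

# core input is 2d (n,m); the result a[0]+a[1] is 1d with length m
f = np.vectorize(recording_fun1, signature='(n,m)->(m)')

stack = np.arange(24).reshape(4, 3, 2)   # 4 blocks, each of shape (3, 2)
out = f(stack)                           # loops over the leading axis only
```

Here each call should receive one (3,2) block, and out has shape (4,2), the same as stack[:,0]+stack[:,1].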

See also

Trying to understand signature in numpy.vectorize

With fun1 changed to return a 1-element array, the result has the (3,1) shape:

In [202]: def fun1(a):
     ...:     print(a)
     ...:     b=a[0]+a[1]
     ...:     return np.array([b])
     ...: 
     ...: fun2 = np.vectorize(fun1,signature='(n)->(m)')

In [203]: fun2(input)
[1 2]
[3 4]
[5 6]
Out[203]: 
array([[ 3],
       [ 7],
       [11]])

The signature has to match what the function produces, not what you want the result to be. And pay close attention to what vectorize passes to your function. Guesses and wishes don't count!
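The dimension names in a signature are just labels, but reusing a label means the sizes must agree. A small sketch of that, assuming a hypothetical fun1_col that returns a 1-element array and data standing in for input:

```python
import numpy as np

def fun1_col(a):
    # returns a 1d array of length 1
    return np.array([a[0] + a[1]])

data = np.array([[1, 2],
                 [3, 4],
                 [5, 6]])

# a fresh output label: any output length is accepted
ok = np.vectorize(fun1_col, signature='(n)->(k)')(data)

# reusing 'n' claims the output is as long as the input row (2),
# but the function returns length 1, so vectorize raises ValueError
try:
    np.vectorize(fun1_col, signature='(n)->(n)')(data)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```

The first call gives the (3,1) result; the second fails because the signature describes an output the function does not produce.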

Another way of 'vectorizing' this function:

In [205]: np.vectorize(lambda a1,a2: fun1((a1,a2)), otypes=[int])(input[:,0],input[:,1])
(1, 2)
(3, 4)
(5, 6)
Out[205]: array([ 3,  7, 11])

Now you can even pass (3,1) and (3,) arrays and get (3,3):

In [206]: np.vectorize(lambda a1,a2: fun1((a1,a2)), otypes=[int])(input[:,0,None],input[:,1])
(1, 2)
(1, 4)
(1, 6)
(3, 2)
(3, 4)
(3, 6)
(5, 2)
(5, 4)
(5, 6)
Out[206]: 
array([[ 3,  5,  7],
       [ 5,  7,  9],
       [ 7,  9, 11]])

And without the vectorize baggage:

In [207]: input[:,0,None]+input[:,1]
Out[207]: 
array([[ 3,  5,  7],
       [ 5,  7,  9],
       [ 7,  9, 11]])
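For the original (3,1) target, the same no-vectorize approach might look like the sketch below (data again stands in for input; keepdims is a standard np.sum argument):

```python
import numpy as np

data = np.array([[1, 2],
                 [3, 4],
                 [5, 6]])

# column slices keep the second axis, so the sum is already (3, 1)
by_slices = data[:, [0]] + data[:, [1]]

# np.sum with keepdims=True keeps the reduced axis as length 1
by_sum = np.sum(data, axis=1, keepdims=True)
```

Both give the column array the question asked for, without a Python-level loop.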
hpaulj