numpy vectorize dimension increasing function

Question

I would like to create a function that has input: x.shape==(2,2), and outputs y.shape==(2,2,3).

For example:

import numpy as np

@np.vectorize
def foo(x):
  #This function doesn't work like I want
  return x,x,x

a = np.array([[1,2],[3,4]])
print(foo(a))
#desired output
[[[1 1 1]
  [2 2 2]]

 [[3 3 3]
  [4 4 4]]]

#actual output
(array([[1, 2],
   [3, 4]]), array([[1, 2],
   [3, 4]]), array([[1, 2],
   [3, 4]]))

Or maybe:

@np.vectorize
def bar(x):
  #This function doesn't work like I want
  return np.array([x,2*x,5])

a = np.array([[1,2],[3,4]])
print(bar(a))
#desired output
[[[1 2 5]
  [2 4 5]]

 [[3 6 5]
  [4 8 5]]]

Note that foo is just an example. I want a way to map over a numpy array (which is what vectorize is supposed to do), but have that map take a 0d object and shove a 1d object in its place. It also seems to me that the dimensions here are arbitrary, as one might wish to take a function that takes a 1d object and returns a 3d object, vectorize it, call it on a 5d object, and get back a 7d object.... However, my specific use case only requires vectorizing a 0d to 1d function, and mapping it appropriately over a 2d array.

Can you elaborate on how this 1d object is created? What does it depend on? — ascripter, Jan 26 '18 at 16:37
via the function `foo`. I can include more examples of a function that takes a 0d argument and returns a 1d object.... — Him, Jan 26 '18 at 17:15
Is the question yet clear? Why is this a bad question? Is mapping over numpy arrays not something that other people want to do? — Him, Jan 26 '18 at 17:21
Well, you might as well write your own loop. `np.vectorize` isn't really performant anyway, so if it's not doing what you want just do the iteration yourself. — juanpa.arrivillaga, Jan 26 '18 at 17:47
@juanpa.arrivillaga This is known to me, however, looping isn't very numpythonic :), and anyhow, I'm not tied specifically to the `vectorize` function. I just want a way to perform the mapping operation that I've described. — Him, Jan 26 '18 at 17:48
`np.vectorize`, in its default mode, passes scalar elements from the arguments to your function. There is a way of specifying dimensions, but it's more complicated. Either way it's at best a convenience function, not a way of speeding up your code. — hpaulj, Jan 26 '18 at 17:49
If you are just worried about 'numpythonic' just hide your loops in a function call :) — hpaulj, Jan 26 '18 at 17:52
Take a look [here](https://stackoverflow.com/questions/35215161/most-efficient-way-to-map-function-over-numpy-array). There is no standard way to map over a numpy array, even in the simple case of a single dimension. Usually, the best answer is to translate into vectorized ufuncs. You can always write your own, using C or Cython. Also, there is the JITed route using `numba`. I've found it is a good tool for writing more complex operations in a space-efficient way that also maintains speed without the need to delve into C-extensions. — juanpa.arrivillaga, Jan 26 '18 at 18:24

hpaulj · Accepted Answer · 2018-01-26T19:11:43.833

It would help, in your question, to show both the actual result and your desired result. As written that isn't very clear.

In [79]: foo(np.array([[1,2],[3,4]]))
Out[79]: 
(array([[1, 2],
        [3, 4]]), array([[1, 2],
        [3, 4]]), array([[1, 2],
        [3, 4]]))

As indicated in the vectorize docs, this has returned a tuple of arrays, corresponding to the tuple of values that your function returned.

Your bar returns an array, whereas vectorize expects it to return a scalar (or single value):

In [82]: bar(np.array([[1,2],[3,4]]))
ValueError: setting an array element with a sequence.

vectorize takes an otypes parameter that sometimes helps. For example if I say that bar (without the wrapper) returns an object, I get:

In [84]: f=np.vectorize(bar, otypes=[object])

In [85]: f(np.array([[1,2],[3,4]]))
Out[85]: 
array([[array([1, 2, 5]), array([2, 4, 5])],
       [array([3, 6, 5]), array([4, 8, 5])]], dtype=object)

A (2,2) array of (3,) arrays. The (2,2) shape matches the shape of the input.
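If a numeric (2,2,3) array is wanted rather than an object array, the cells can be stacked afterwards. This is only a sketch, assuming every call returns an array of the same length; the names `obj` and `stacked` are mine, not from the question.

```python
import numpy as np

def bar(x):
    # Same scalar -> 1d function as in the question, without the decorator.
    return np.array([x, 2 * x, 5])

a = np.array([[1, 2], [3, 4]])
f = np.vectorize(bar, otypes=[object])
obj = f(a)  # shape (2, 2), dtype=object, each cell holds a (3,) array

# Stack the cells into one numeric array, then restore the input's leading shape.
stacked = np.stack(list(obj.ravel())).reshape(obj.shape + (-1,))
print(stacked.shape)
```

The extra pass over the cells costs time, so this is a convenience rather than a speed-up.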

vectorize has a relatively new parameter, signature (added in NumPy 1.12):

In [90]: f=np.vectorize(bar, signature='()->(n)')

In [91]: f(np.array([[1,2],[3,4]]))
Out[91]: 
array([[[1, 2, 5],
        [2, 4, 5]],

       [[3, 6, 5],
        [4, 8, 5]]])
In [92]: _.shape
Out[92]: (2, 2, 3)

I haven't used this much, so I am still getting a feel for how it works. When I've tested it, it has been slower than the original scalar version of vectorize. Neither offers any speed advantage over explicit loops. However, vectorize does help when 'broadcasting', allowing you to use a variety of input shapes. That's even more useful when your function takes several inputs, not just one as in this case.

In [94]: f(np.array([1,2]))
Out[94]: 
array([[1, 2, 5],
       [2, 4, 5]])

In [95]: f(np.array(3))
Out[95]: array([3, 6, 5])
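To illustrate the several-inputs case and the more general shapes raised in the question, here is a rough sketch. The functions `combine` and `cube` are hypothetical, made up for this illustration only.

```python
import numpy as np

def combine(x, y):
    # Hypothetical two-argument function: two scalars in, a length-3 vector out.
    return np.array([x, y, x + y])

g = np.vectorize(combine, signature='(),()->(n)')

col = np.array([[1], [2]])    # shape (2, 1)
row = np.array([10, 20, 30])  # shape (3,)
out = g(col, row)             # loop dimensions broadcast to (2, 3); the core output adds (3,)
print(out.shape)

def cube(v):
    # Hypothetical 1d -> 3d function: fills a (2, 2, 2) block with the sum of v.
    return np.full((2, 2, 2), v.sum())

h = np.vectorize(cube, signature='(n)->(p,q,r)')
big = np.ones((2, 1, 2, 1, 3))  # 5d input; the last axis is consumed as the core dimension (n)
print(h(big).shape)
```

The second call takes a 5d input and returns a 7d result, which is the "1d to 3d mapped over 5d" case from the question. Both calls still run a Python-level loop internally.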

For best speed, you want to use existing numpy whole-array functions where possible. For example your foo case can be done with:

In [97]: np.repeat(a[:,:,None],3, axis=2)
Out[97]: 
array([[[1, 1, 1],
        [2, 2, 2]],

       [[3, 3, 3],
        [4, 4, 4]]])

np.stack([a]*3, axis=2) also works.

And your bar desired result:

In [100]: np.stack([a, 2*a, np.full(a.shape, 5)], axis=2)
Out[100]: 
array([[[1, 2, 5],
        [2, 4, 5]],

       [[3, 6, 5],
        [4, 8, 5]]])

2*a takes advantage of the whole-array multiplication. That's true 'numpy-onic' thinking.
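As a sketch of the same idea for inputs of any dimension, bar can also be written with broadcasting alone. The `scale` and `offset` vectors are my own decomposition of bar, not something from the question. The explicit loop is included only to check the result.

```python
import numpy as np

def bar(x):
    return np.array([x, 2 * x, 5])

a = np.array([[1, 2], [3, 4]])

# Explicit loop: visit every index of `a` and fill one length-3 row per element.
looped = np.empty(a.shape + (3,), dtype=a.dtype)
for idx in np.ndindex(a.shape):
    looped[idx] = bar(a[idx])

# Whole-array version: broadcast a trailing axis against per-component
# scale and offset vectors, so each element x becomes [1*x, 2*x, 0*x + 5].
scale = np.array([1, 2, 0])
offset = np.array([0, 0, 5])
whole = a[..., np.newaxis] * scale + offset

print(np.array_equal(looped, whole))
```

This only works because bar happens to be linear in x. A function that cannot be expressed with array operations still needs some form of loop.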

I timeit'd this last thing, `np.stack`..., the solution in my answer takes 5 times as long. — Him, Jan 29 '18 at 15:17

score 2 · Answer 2 · answered Jan 26 '18 at 16:06

Just repeating the value into another dimension is quite simple:

import numpy as np

x = a = np.array([[1,2],[3,4]])
y = np.repeat(x[:,:,np.newaxis], 3, axis=2)  # add a trailing axis, then repeat along it
print(y.shape)
print(y)

(2, 2, 3)
[[[1 1 1]
  [2 2 2]]

 [[3 3 3]
  [4 4 4]]]

answered by ascripter

I do not need to simply repeat the value into another dimension, that was merely an example... – Him Jan 26 '18 at 16:23
I'll think of it as soon as I find the time. It's an interesting problem. – ascripter Jan 26 '18 at 17:40

score 0 · Answer 3 · answered Jan 26 '18 at 18:35

This seems to work for the case "f: R0 -> R1, mapped over an nd array, giving an (n+1)d one":

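import numpy as np

x = np.array([[1,2],[3,4]])
def foo(x):
  return np.concatenate((x,x))  # x arrives here as a length-1 1d slice
np.apply_along_axis(foo,2,x.reshape(list(x.shape)+[1]))  # result has shape (2,2,2)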

It doesn't generalize all that well, though.
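One possible way to make the reshape trick in this answer independent of the input's number of dimensions is sketched here. The helper name `map_scalar_to_vector` is hypothetical, and `bar` is the function from the question. It still hides a Python loop, so it is no faster.

```python
import numpy as np

def map_scalar_to_vector(func, arr):
    # Hypothetical helper: append a length-1 axis so every element becomes
    # its own 1d slice, then let apply_along_axis replace that axis with
    # whatever 1d array func returns for the element.
    arr = np.asarray(arr)
    return np.apply_along_axis(lambda v: func(v[0]), -1, arr[..., np.newaxis])

def bar(x):
    return np.array([x, 2 * x, 5])

a = np.array([[1, 2], [3, 4]])
res2d = map_scalar_to_vector(bar, a)
res3d = map_scalar_to_vector(bar, np.arange(24).reshape(2, 3, 4))
print(res2d.shape)
print(res3d.shape)
```

Using `arr[..., np.newaxis]` with `axis=-1` avoids hard-coding the axis number 2, which is what ties the original snippet to 2d inputs.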

answered by Him

This apply function just hides a loop. – hpaulj Jan 26 '18 at 18:40
"hides a loop": hides a python loop, or hides a machine language loop? AFAIK, a map operation essentially must boil down to "1) operator, memory increment, check for end, if not, goto 1)" – Him Jan 28 '18 at 21:43
Click on the [source] link on the documentation page, https://github.com/numpy/numpy/blob/v1.14.0/numpy/lib/shape_base.py#L23-L167. Note the use of `ndindex` and the `for ind in inds:` loop. – hpaulj Jan 28 '18 at 22:34
numpy isn't numpythonic. :) – Him Jan 29 '18 at 15:02
