I am trying to understand the `signature` functionality in `numpy.vectorize`. I have looked at some examples, but they did not help much with my understanding.

>>> import numpy as np
>>> import scipy.stats
>>> pearsonr = np.vectorize(scipy.stats.pearsonr, signature='(n),(n)->(),()')
>>> pearsonr([[0, 1, 2, 3]], [[1, 2, 3, 4], [4, 3, 2, 1]])
(array([ 1., -1.]), array([ 0.,  0.]))

>>> convolve = np.vectorize(np.convolve, signature='(n),(m)->(k)')
>>> convolve(np.eye(4), [1, 2, 1])
array([[1., 2., 1., 0., 0., 0.],
       [0., 1., 2., 1., 0., 0.],
       [0., 0., 1., 2., 1., 0.],
       [0., 0., 0., 1., 2., 1.]])

>>> import numpy as np
>>> qr = np.vectorize(np.linalg.qr, signature='(m,n)->(m,k),(k,n)')
>>> qr(np.random.normal(size=(1, 3, 2)))
(array([[-0.31622777, -0.9486833 ],
       [-0.9486833 ,  0.31622777]]), 
array([[-3.16227766, -4.42718872, -5.69209979],
       [ 0.        , -0.63245553, -1.26491106]]))

>>> import scipy.linalg
>>> logm = np.vectorize(scipy.linalg.logm, signature='(m,m)->(m,m)')
>>> logm(np.random.normal(size=(1, 3, 2)))
array([[[ 1.08226288, -2.29544602],
        [ 2.12599894, -1.26335203]]])

Can someone please explain the functionality and syntax of the signatures

signature='(n),(n)->(),()'
signature='(n),(m)->(k)'
signature='(m,n)->(m,k),(k,n)'
signature='(m,m)->(m,m)'

used in the aforementioned examples? If we didn't use the signatures, how would the examples be implemented in a simpler, more naive way?

Any help is highly appreciated.

The aforementioned examples can be found here and here.

Darkmoor
  • what are the signatures of the three functions used in these examples? – hpaulj Nov 27 '22 at 13:03
  • @hpaulj thanks for the response. I don't quite follow your question. Could you please provide some more details? – Darkmoor Nov 27 '22 at 14:10
  • I did some edits on the post. – Darkmoor Nov 27 '22 at 14:22
  • for example what does `scipy.stats.pearsonr` accept and produce? I could look it up, but would prefer if you did the work. – hpaulj Nov 27 '22 at 15:31
  • Thanks again for the response. As I can see, `scipy.stats.pearsonr` takes as input two arrays of the same dimension. So `'(n),(n)->(),()'` I suppose means that the inputs have to be of the same dimension `n`, and it produces two scalars, the correlation coefficient and p-value. Thus, the first and second elements of the output `(array([ 1., -1.]), array([ 0., 0.]))` refer to `pearsonr([[0, 1, 2, 3]], [[1, 2, 3, 4]])` and `pearsonr([[0, 1, 2, 3]], [[4, 3, 2, 1]])`, respectively. Am I right? – Darkmoor Nov 27 '22 at 16:46
  • Yes that's right. – hpaulj Nov 27 '22 at 16:58

1 Answer


I think the explanation would be clearer if we knew the 'signature' of the individual functions - what they expect, and what they produce. But I can make some deductions from the code you show.

>>> pearsonr = np.vectorize(scipy.stats.pearsonr, signature='(n),(n)->(),()')
>>> pearsonr([[0, 1, 2, 3]], [[1, 2, 3, 4], [4, 3, 2, 1]])
(array([ 1., -1.]), array([ 0.,  0.]))

This is called with (1,4) and (2,4) arrays (well, lists that become such arrays). They broadcast together to (2,4). The stats function is then called twice, once for each row pair, getting two (4,) arrays each time and returning 2 scalar values (the correlation coefficient and the p-value).
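A naive equivalent of what vectorize does here, without the signature machinery, is a sketch like this: broadcast the loop (non-core) dimensions of the inputs against each other, then call the scalar function in an explicit loop over them:

```python
import numpy as np
import scipy.stats

x = [[0, 1, 2, 3]]                    # (1, 4)
y = [[1, 2, 3, 4], [4, 3, 2, 1]]      # (2, 4)

# Broadcast the leading (loop) dimensions; the last axis (n=4) is the
# "core" dimension that pearsonr itself consumes.
xb, yb = np.broadcast_arrays(x, y)    # both become (2, 4)
r = np.empty(xb.shape[0])
p = np.empty(xb.shape[0])
for i in range(xb.shape[0]):
    r[i], p[i] = scipy.stats.pearsonr(xb[i], yb[i])
# r -> [ 1., -1.],  p -> [0., 0.]
```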

>>> convolve = np.vectorize(np.convolve, signature='(n),(m)->(k)')
>>> convolve(np.eye(4), [1, 2, 1])
array([[1., 2., 1., 0., 0., 0.],
       [0., 1., 2., 1., 0., 0.],
       [0., 0., 1., 2., 1., 0.],
       [0., 0., 0., 1., 2., 1.]])

This is called with (4,4) and (3,) arrays. convolve gets called 4 times, once for each row of the eye, getting the same [1, 2, 1] each time. The result is a 4-row array with 6 columns (k = n + m - 1 = 6, determined by convolve itself, not by vectorize).
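Without a signature, the same result comes from a plain loop (or list comprehension) over the rows; a naive sketch:

```python
import numpy as np

a = np.eye(4)          # (4, 4): four length-4 rows (core dim n=4)
v = [1, 2, 1]          # (3,): core dim m=3

# One np.convolve call per row; the output length k = n + m - 1 = 6
# is discovered from the individual calls, not declared in advance.
out = np.array([np.convolve(row, v) for row in a])
print(out.shape)       # (4, 6)
```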

>>>import numpy as np
>>>qr = np.vectorize(np.linalg.qr, signature='(m,n)->(m,k),(k,n)')
>>>qr(np.random.normal(size=(1, 3, 2)))
(array([[-0.31622777, -0.9486833 ],
       [-0.9486833 ,  0.31622777]]), 
array([[-3.16227766, -4.42718872, -5.69209979],
       [ 0.        , -0.63245553, -1.26491106]]))

Signature: np.linalg.qr(a, mode='reduced') a : array_like, shape (M, N)

  • 'reduced' : returns q, r with dimensions (M, K), (K, N) (default)

The vectorize signature just repeats the information in the docs.

a is a (1,3,2) shape array, so qr is called once (looping over the 1st dimension) with a (3,2) array. The result is 2 arrays with (3,k) and (k,2) shapes, where here k=2. When I run it, the results get an added size-1 leading dimension: (1,3,2) and (1,2,2). Different numbers because of random:

In [120]: qr = np.vectorize(np.linalg.qr, signature='(m,n)->(m,k),(k,n)')
     ...: qr(np.random.normal(size=(1, 3,2)))
Out[120]: 
(array([[[-0.61362528,  0.09161174],
         [ 0.63682861, -0.52978942],
         [-0.46681188, -0.84316692]]]),
 array([[[-0.65301725, -1.00494992],
         [ 0.        ,  0.8068886 ]]]))
    
>>> import scipy.linalg
>>> logm = np.vectorize(scipy.linalg.logm, signature='(m,m)->(m,m)')
>>> logm(np.random.normal(size=(1, 3, 2)))
array([[[ 1.08226288, -2.29544602],
        [ 2.12599894, -1.26335203]]])

scipy.linalg.logm expects a square array, and returns one of the same shape.

Calling logm with a (1,3,2) produces an error, because (3,2) is not a square array:

ValueError: inconsistent size for core dimension 'm': 2 vs 3

Calling scipy.linalg.logm directly produces the same error, worded differently:

scipy.linalg.logm(np.random.normal(size=(3, 2)))
ValueError: expected square array_like input
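With square inputs the `(m,m)->(m,m)` signature is satisfied, and a naive loop over the batch axis does the same job; a sketch using two simple square matrices (chosen here for illustration, not from the question):

```python
import numpy as np
import scipy.linalg

# Two 2x2 square matrices stacked along a leading batch axis: shape (2, 2, 2).
a = np.stack([np.eye(2), 2 * np.eye(2)])

# scipy.linalg.logm only accepts one square matrix at a time, so loop:
out = np.array([scipy.linalg.logm(m) for m in a])
# out[0] is the zero matrix (logm of I); out[1] is log(2) * I.
```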

When I say the function is called twice, or something like that, I'm ignoring the test call that's used to determine the return dtype.

hpaulj
  • Thank you very much for the answer, I really appreciate it. I understand it much better now with your explanations. Maybe there are some dimension typos in the `qr` and `logm` examples. So `np.vectorize` applies a function as if in a for loop, and a signature is needed to specify the core dimensions over which the vectorization/broadcasting is applied? – Darkmoor Nov 27 '22 at 18:30
  • Any comments please before closing the post? – Darkmoor Nov 28 '22 at 10:39