0

After a projection I need to create a multidim array with contiguous diagonal values on base of a 1D array, e.g. with some kind of multiplication with a multidim identity array. The following is just a small example for the concept. As the real data from a projection result will be a lot larger than the following 12 values example input data I need an efficient way for the data handling. (Projection problem: The original "compact" axes cannot be used anymore but the data values are still the same)

Input:

>>> a=np.arange(1,13)
array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.])

>>>a.shape
(12,)

Required Output:

array([[[ 1.,  0.,  0.,  0.],
        [ 0.,  2.,  0.,  0.],
        [ 0.,  0.,  3.,  0.],
        [ 0.,  0.,  0.,  4.]],

       [[ 5.,  0.,  0.,  0.],
        [ 0.,  6.,  0.,  0.],
        [ 0.,  0.,  7.,  0.],
        [ 0.,  0.,  0.,  8.]],

       [[ 9.,  0.,  0.,  0.],
        [ 0., 10.,  0.,  0.],
        [ 0.,  0., 11.,  0.],
        [ 0.,  0.,  0., 12.]]])
shape: (3, 4, 4)

I couldn't find a solution trying to use the following identity matrix:

>>> np.tile(np.identity(4),(3,1)).reshape(3,4,4)
array([[[1., 0., 0., 0.],
        [0., 1., 0., 0.],
        [0., 0., 1., 0.],
        [0., 0., 0., 1.]],

       [[1., 0., 0., 0.],
        [0., 1., 0., 0.],
        [0., 0., 1., 0.],
        [0., 0., 0., 1.]],

       [[1., 0., 0., 0.],
        [0., 1., 0., 0.],
        [0., 0., 1., 0.],
        [0., 0., 0., 1.]]])
Tobias
  • 33
  • 5
  • I was completely misled by the diagonal projection result of pyproj. I had needed to do a resampling if I was to continue with pyproj. A quicker way instead was to use a ready-to-use reprojection algorithm e.g. by rioxarray. An example reprojecting a netCDF file: https://corteva.github.io/rioxarray/stable/examples/reproject.html See https://stackoverflow.com/questions/76739879/reproject-geo-data-and-save-to-netcdf4/76741804 for details – Tobias Aug 09 '23 at 12:37

3 Answers3

2

While there are functions like np.diag to make diagonals, they are all aimed at 2d.

The following is a bit round about, but should work reasonably fast:

res=np.eye(4)[None].repeat(3,axis=0)
res[np.nonzero(res)]=np.arange(1,13)

repr(res) 
array([[[ 1.,  0.,  0.,  0.],
        [ 0.,  2.,  0.,  0.],
        [ 0.,  0.,  3.,  0.],
        [ 0.,  0.,  0.,  4.]],

       [[ 5.,  0.,  0.,  0.],
        [ 0.,  6.,  0.,  0.],
        [ 0.,  0.,  7.,  0.],
        [ 0.,  0.,  0.,  8.]],

       [[ 9.,  0.,  0.,  0.],
        [ 0., 10.,  0.,  0.],
        [ 0.,  0., 11.,  0.],
        [ 0.,  0.,  0., 12.]]])

Another is a bit of wild guess, but works:

In [123]: arr = np.diag(np.arange(1,13))
In [124]: arr.shape
Out[124]: (12, 12)
In [125]: arr.reshape(3,4,3,4)[np.arange(3),:,np.arange(3),:]
Out[125]: 
array([[[ 1,  0,  0,  0],
        [ 0,  2,  0,  0],
        [ 0,  0,  3,  0],
        [ 0,  0,  0,  4]],

       [[ 5,  0,  0,  0],
        [ 0,  6,  0,  0],
        [ 0,  0,  7,  0],
        [ 0,  0,  0,  8]],

       [[ 9,  0,  0,  0],
        [ 0, 10,  0,  0],
        [ 0,  0, 11,  0],
        [ 0,  0,  0, 12]]])
In [126]: _.shape
Out[126]: (3, 4, 4)

Mixing basic and advanced indexing like this can produce unexpected shapes, but here is works as I hoped.

I played with flat indexing and stride_tricks, but didn't get anywhere. Part of the problem is that this isn't a true diagonal.

Another way - make the target array, and index a (3,4) block:

In [134]: res = np.zeros((3,4,4))
In [135]: res[np.arange(3)[:,None], np.arange(4), np.arange(4)] 
= np.arange(1,13).reshape(3,4)
hpaulj
  • 221,503
  • 14
  • 230
  • 353
1

Just to add mine. And because I wanted to see if I could go somewhere using strides_trick :D

def mydiag(d, n, p):
    padded = np.zeros((n*p,n+1))
    padded[:,0]=d
    s1,s2=padded.strides
    return np.lib.stride_tricks.as_strided(padded, shape=(p,n,n), strides=(s1*n,s1-s2,s2))

Explanation: if you look at the flatten result, what you have are the diagonal values, separated by n zeros (n being the size of each square matrix, and p the number of those. So n×p=len(d)). Except when we change "plane", where the diagonal values are consecutive.

So I build an array with these diagonal values and n 0 in between (I could have done this with np.pad, but it is slower than just fill a np.zeros).

Then, I tweak strides, so that it shift one element for each line of each matrix.

Some timings for 4x3 case

| - | - | | Method | Timing | | hpaulj's 1st | 15.84 μs | | hpaulj's 2nd | 16.27 μs | | hpaulj's 3rd | 11.44 μs | | jared's | 25.87 μs | | Mine | 17.74 μs |

So clearly, for 3x4 case, that is not very interesting.

It becomes more intersting for bigger cases

Timing for 300x400 cases method
hpaulj's 1st 602 ms
hpaulj's 2nd crashes (memory)
hpaulj's 3rd 192 ms
jared's 347 ms
Mine 168 ms

Still even if this case, it is not crushing hpaulj's 3rd, which seems to be have the best timing or almost so, both in small and big cases. And in some run's the timing are inverted. So it isn't even a clear win for me, for big cases. And anyway, it is not a O(...) difference. Obviously, the bigger the size of the data is, the more negligible the difference becomes, because after a while, for both method, it is just the copy of the n×p values that cost.

But, well, it is a method using as_strided, and it is not worse than other, timing-wise.

(I don't think we can use as_strided while avoiding what cost me time here, that is the creation of padded matrix: we need those 0 from somewhere anyway, since as_strided can only rearrange how data are iterated, but cannot create those 0. It could of course "duplicate" them, but not without duplicating also non-zeros)

chrslg
  • 9,023
  • 5
  • 17
  • 31
  • While `as_strided` retuns a view, there's something in its consstruction that's time consuming. – hpaulj Jul 14 '23 at 18:50
  • @hpaulj Yes indeed. That is why, for the first times in my dozens of usage of `as_strided` here, I didn't brag about it returning just a view. But, again, I don't think that is avoidable. We have to bring all those zeros from somewhere. Even `np.diag` doesn't provide a view. – chrslg Jul 14 '23 at 21:04
0

Taking what you started with, we can multiply each of the inner arrays with the desired values. To do this, we need to reshape the multiplying array to a compatible shape.

import numpy as np

a = np.tile(np.identity(4),(3,1)).reshape(3,4,4)
b = np.arange(1,13).reshape(a.shape[:2])[:,None,:]
a *= b
print(a)

Output:

[[[ 1.  0.  0.  0.]
  [ 0.  2.  0.  0.]
  [ 0.  0.  3.  0.]
  [ 0.  0.  0.  4.]]

 [[ 5.  0.  0.  0.]
  [ 0.  6.  0.  0.]
  [ 0.  0.  7.  0.]
  [ 0.  0.  0.  8.]]

 [[ 9.  0.  0.  0.]
  [ 0. 10.  0.  0.]
  [ 0.  0. 11.  0.]
  [ 0.  0.  0. 12.]]]
jared
  • 4,165
  • 1
  • 8
  • 31