
Given a NumPy array of shape (n, m) and a NumPy array of shape (p,) describing a partition of the range 0 to n, what is the fastest way to return a NumPy array of shape (p, m) in which each row is the sum of the input rows belonging to the corresponding partition? For example, one such partition for n=6 would be [0, 2, 4], meaning the row indices are partitioned as (0, 1), (2, 3), (4, 5).

For example,

[[0,1,1,1],
 [2,0,1,1],
 [0,0,0,1],
 [5,1,0,0]]

given the partition [0,1] should return

[[0,1,1,1],
 [7,1,1,2]]

I already have a solution which is constructing the matrix

[[1,0,0,0],
 [0,1,1,1]]

and left-multiplying the initial matrix by it to get the desired matrix. I think this should be pretty fast, but there might be something faster along the lines of numpy.ufunc.reduceat (https://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.reduceat.html) using the partition array. Any help?
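For reference, here is a minimal sketch of the indicator-matrix approach described above, using the example data. The helper name `partition_indicator` is made up for illustration, and it assumes the partition is sorted, starts at 0, and contains no repeated indices.

```python
import numpy as np

def partition_indicator(partition, n):
    """Dense (p, n) 0/1 matrix; row i marks the input rows in group i.

    Hypothetical helper: assumes `partition` is sorted and starts at 0.
    """
    partition = np.asarray(partition)
    rows = np.arange(n)
    # Group id of each input row: index of the last partition start <= row.
    group = np.searchsorted(partition, rows, side="right") - 1
    indicator = np.zeros((len(partition), n), dtype=int)
    indicator[group, rows] = 1
    return indicator

matrix = np.array([[0, 1, 1, 1],
                   [2, 0, 1, 1],
                   [0, 0, 0, 1],
                   [5, 1, 0, 0]])
partition = [0, 1]

indicator = partition_indicator(partition, matrix.shape[0])
summed = indicator @ matrix  # left-multiply to sum the rows of each group
print(summed)
```

The dense indicator costs O(p * n) memory, so this form is mainly useful for small inputs or for checking other implementations.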

Wait... I just read the reduceat documentation, and you can simply do np.add.reduceat(matrix, partition, axis=0). I remember thinking this was not possible, probably because my application needs it for a sparse matrix. So could anyone advise on how to do this when the input 2D array is in a sparse format?
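For the dense case, the call above can be checked against the example from the question. This is only a sketch with the example data; it assumes the partition indices are strictly increasing, since `reduceat` returns just the single row at `indices[i]` whenever `indices[i] >= indices[i+1]`.

```python
import numpy as np

matrix = np.array([[0, 1, 1, 1],
                   [2, 0, 1, 1],
                   [0, 0, 0, 1],
                   [5, 1, 0, 0]])
partition = np.array([0, 1])  # group starts: rows [0:1] and rows [1:4]

# Sum the rows between consecutive partition starts; the last group
# runs to the end of the array.
summed = np.add.reduceat(matrix, partition, axis=0)
print(summed)
```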

ffffffyyyy
  • Add an [`mcve`](https://stackoverflow.com/help/mcve) for the sparse case? – Divakar May 22 '18 at 07:08
  • Perhaps rephrase the question title/body to something like "alternatives to `np.ufunc.reduceat()` for a sparse matrix". And yes, clarify what you mean specifically by sparse. – eugenhu May 22 '18 at 07:08
  • I think your matrix multiplication solution is the way to go in the sparse case. You can get the other matrix as `sparse.block_diag(np.split(np.ones(n), partition[1:]))` or manually construct a csr. – Paul Panzer May 22 '18 at 08:21
  • Thank you @PaulPanzer, your way of constructing the matrix is very helpful. – ffffffyyyy Jun 13 '18 at 15:26
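As a rough sketch of the "manually construct a csr" route mentioned in the comments above: the partition array, with n appended, can serve directly as the `indptr` of a CSR indicator matrix. The function name `sum_row_groups` is hypothetical, and the sketch assumes a sorted partition starting at 0 and an input that supports the `@` operator with SciPy sparse matrices.

```python
import numpy as np
from scipy import sparse

def sum_row_groups(matrix, partition):
    """Sum the rows of a sparse `matrix` within each partition group.

    Hypothetical helper: assumes `partition` is sorted and starts at 0.
    """
    n = matrix.shape[0]
    partition = np.asarray(partition)
    # CSR row i of the indicator holds ones in columns
    # partition[i] .. partition[i+1]-1, so the partition is the indptr.
    indptr = np.append(partition, n)
    indicator = sparse.csr_matrix(
        (np.ones(n, dtype=matrix.dtype), np.arange(n), indptr),
        shape=(len(partition), n),
    )
    return indicator @ matrix  # result stays sparse

dense = np.array([[0, 1, 1, 1],
                  [2, 0, 1, 1],
                  [0, 0, 0, 1],
                  [5, 1, 0, 0]])
result = sum_row_groups(sparse.csr_matrix(dense), [0, 1])
print(result.toarray())
```

The indicator stores only n nonzeros, so building it is cheap compared with the sparse product itself.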
