Is there a way to broadcast boolean masks?

Question

I'm trying to reduce the number of calculations I do based on search distance. I have N nodes and an [NxN] boolean mask, with T true values, that tells me which nodes are within distance X of each other.

I also have [Nx(d)] data for each node, where (d) can be (1), (3), or (3x3). I want the "sparse" format, a [Tx(d)] array, so I can do vectorized calculations along axis 0. Right now I do this:

sparseData = data.repeat(data.shape[0], axis=0).reshape((data.shape[0],) + data.shape)[mask]

This works, but causes memory errors if N is too big, because of the [NxNx(d)] array I'm creating with .repeat. Is there a way to broadcast this instead? If I do this:

data[None,...][mask]

It doesn't work, but it seems like there has to be a more efficient way to do this.

score 4 · Accepted Answer · edited Aug 19 '23 at 09:47


Instead of repeating the data you can make a view with numpy.broadcast_to:

sparseData = np.broadcast_to(data, (data.shape[0],) + data.shape)[mask]
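As a rough illustration (the array sizes and the random mask are made up here), the broadcast view has stride 0 along the new axis, so no [NxNx(d)] copy is allocated. One caveat worth checking: with the axis prepended, entry [i, j] of the view is data[j], whereas the repeat-based code in the question puts data[i] at [i, j]. Broadcasting data[:, None] instead appears to reproduce the question's layout.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5                                   # hypothetical node count
data = rng.normal(size=(N, 3))          # the (d) = (3) case
mask = rng.random((N, N)) < 0.4         # stand-in for the distance mask

# Prepending an axis: view_j[i, j] is data[j]; stride 0 means no copy is made.
view_j = np.broadcast_to(data, (N,) + data.shape)
print(view_j.strides[0])

# Inserting the axis in the middle: view_i[i, j] is data[i], matching the
# repeat/reshape version from the question.
view_i = np.broadcast_to(data[:, None], (N, N) + data.shape[1:])

repeated = data.repeat(N, axis=0).reshape((N,) + data.shape)
assert np.array_equal(view_i[mask], repeated[mask])
```

Only the boolean indexing step allocates memory, and what it allocates is the [Tx(d)] result.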

However, even easier would be to select the rows of data based on index:

I, J = np.nonzero(mask)
sparseData = data[I]  # row node of each pair; data[J] gives the column node
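A small sketch of the index-based version, checked against the repeat-based code from the question for the three data shapes mentioned there. The helper name `sparse_rows` and the test sizes are hypothetical, chosen only for illustration.

```python
import numpy as np

def sparse_rows(data, mask):
    """Return data[i] for every True mask[i, j], in row-major order."""
    I, J = np.nonzero(mask)   # I: row node, J: column node of each pair
    return data[I]

rng = np.random.default_rng(1)
N = 6                                   # hypothetical node count
mask = rng.random((N, N)) < 0.3         # stand-in for the distance mask
T = int(mask.sum())

for d in [(1,), (3,), (3, 3)]:
    data = rng.normal(size=(N,) + d)
    # Reference: the memory-hungry repeat/reshape approach.
    dense = data.repeat(N, axis=0).reshape((N,) + data.shape)[mask]
    sparse = sparse_rows(data, mask)
    assert sparse.shape == (T,) + d
    assert np.array_equal(sparse, dense)
```

Beyond the mask itself, the extra memory here is the [Tx(d)] result plus two length-T index arrays, rather than an [NxNx(d)] intermediate.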

edited Aug 19 '23 at 09:47 by Mateen Ulhaq

answered Jan 19 '17 at 10:39 by user7138814

Thanks, I just figured out an answer based on `np.where(mask)[1]` which is equivalent to your second answer. Sometimes just writing out the question brings answers to mind. – Daniel F Jan 19 '17 at 10:47
