I need to index a contiguous integer range [0, n) with an integer array,
where some values of the range must always be ignored.
The ignored values should not appear in the result.
A separate NumPy boolean array of length n
(i.e., a mask) indicates whether each element of the original range is ignored.
In pure Python, I would write it like this:
def get_non_masked_indices(range_mask, indices):
    return [i for i in indices if not range_mask[i]]
For this input
#                0  1  2  3  4  5  6  7  8  9
mask = np.array([0, 1, 0, 1, 0, 0, 1, 1, 1, 0], dtype=bool)
idxs = np.array([      2, 3, 4,       7,    9])
#                      +  -  +        -     +
the result of invoking get_non_masked_indices(mask, idxs)
would be
[2, 4, 9]
This is a frequently used array-processing pattern (especially in graph algorithms). Is there a NumPy function to facilitate that?
So far, I have come up with the following options:
- Native NumPy indexing
- Masking with an indexed mask
- Indexing a masked range
Native NumPy indexing:
return indices[np.logical_not(range_mask[indices])]
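For reference, here is a minimal runnable sketch of the native-indexing option applied to the sample input. The function name get_non_masked_indices_np is only a placeholder for this illustration, and the sketch uses `~`, which for a boolean array gives the same result as np.logical_not. Note that it returns a NumPy array rather than a list.

```python
import numpy as np

def get_non_masked_indices_np(range_mask, indices):
    # Look up the mask only at the requested positions, then keep the
    # indices whose mask entry is False. Assumes range_mask has a boolean dtype.
    return indices[~range_mask[indices]]

mask = np.array([0, 1, 0, 1, 0, 0, 1, 1, 1, 0], dtype=bool)
idxs = np.array([2, 3, 4, 7, 9])

result = get_non_masked_indices_np(mask, idxs)
print(result)  # [2 4 9]
```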
Masking with an indexed mask:
return np.ma.MaskedArray(indices, range_mask[indices]).compressed()
return np.ma.masked_where(range_mask[indices], indices).compressed()
Indexing a masked range:
return np.ma.MaskedArray(np.arange(len(range_mask)), range_mask)[indices].compressed()
return np.ma.masked_where(range_mask, np.arange(len(range_mask)))[indices].compressed()
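As a sanity check, the masked-array variants above can be run side by side on the sample input. This is only a sketch to confirm that they agree on this one example, not a benchmark; the names full_range and variants are introduced here purely for illustration.

```python
import numpy as np

mask = np.array([0, 1, 0, 1, 0, 0, 1, 1, 1, 0], dtype=bool)
idxs = np.array([2, 3, 4, 7, 9])

# Masking with an indexed mask: build a masked array over the indices only.
a = np.ma.MaskedArray(idxs, mask[idxs]).compressed()
b = np.ma.masked_where(mask[idxs], idxs).compressed()

# Indexing a masked range: mask the whole range [0, n) first, then index it.
full_range = np.arange(len(mask))
c = np.ma.MaskedArray(full_range, mask)[idxs].compressed()
d = np.ma.masked_where(mask, full_range)[idxs].compressed()

variants = [a, b, c, d]
for v in variants:
    print(v)  # each variant prints [2 4 9]
```

The second pair materializes an array of length n on every call, so it likely does more work than the first pair when the index array is much shorter than the range.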
An example from an application
Assume we have a graph represented as a list of NumPy arrays of adjacent nodes.
adjacent_nodes = [
    np.array([1, 2]),
    np.array([0]),
    np.array([0]),
]
is_colored = np.array([False, False, True])
The function I am interested in should return only the non-colored neighbors of a node:
get_non_masked_indices(is_colored, adjacent_nodes[0]) # -> [1]
get_non_masked_indices(is_colored, adjacent_nodes[1]) # -> [0]
get_non_masked_indices(is_colored, adjacent_nodes[2]) # -> [0]
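To make the graph example concrete, here is a small self-contained sketch that applies the native-indexing option to every node. The helper name non_colored_neighbors is hypothetical and exists only for this illustration.

```python
import numpy as np

adjacent_nodes = [
    np.array([1, 2]),
    np.array([0]),
    np.array([0]),
]
is_colored = np.array([False, False, True])

def non_colored_neighbors(node):
    # Fetch the neighbor indices, then drop those whose color flag is set.
    neighbors = adjacent_nodes[node]
    return neighbors[~is_colored[neighbors]]

results = [non_colored_neighbors(n).tolist() for n in range(len(adjacent_nodes))]
print(results)  # [[1], [0], [0]]
```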