Fast and efficient slice of array avoiding delete operation

Question

I am trying to get a slice (for example elements 1-3 and 5-N) of an array A(N,3) without using numpy.delete. An example of the process is the following:

 [[1,2,3],[4,5,6],[7,8,9],[3,2,1]] ==>  [[1,2,3],[3,2,1]]

I was hoping to use something like

A[A != [1,2,3] ].reshape()

But that performs an element-wise comparison and thus removes more elements than I wanted. How does one do it? I came up with this idea, but it seems too complex and slow:

A_removed = A[first_removed:last_removed,:]
mask      = np.not_equal(A[:,None],A_removed)
mask      = np.logical_and.reduce(mask,1)
A         = A[mask].reshape()

Is there a way of doing it in a faster/cleaner way?

PS: the assumption that no two elements of A are equal always holds.

Could you explain in words how you would get that expected output? — Divakar, Apr 19 '18 at 22:01
Perhaps - `A[~(A != [1,2,3]).all(1)]` or `A[(A == [1,2,3]).any(1)]` going by that complex code. — Divakar, Apr 19 '18 at 22:08
Is your problem to match the given values in *any* order? If so, I recommend that you use either set operations, or sort the values in a temporary variable to compare. — Prune, Apr 19 '18 at 22:10
Honestly, this sounds like a job for `numpy.delete`. You're unlikely to do this any faster than `numpy.delete` can. The most likely optimization path is probably restructuring your computation to eliminate this operation, rather than making this operation faster. — user2357112, Apr 19 '18 at 22:19

score 1 · Answer 1 · edited Jun 20 '20 at 09:12

Edit

Rereading the question, I'm now pretty sure the OP wanted the inverse of what I originally posted. Here's how you get that:

import numpy as np

def selectRow(arr, selrow):
    selset = set(selrow)
    return np.array([row for row in arr if selset == set(row)])
    
arr = np.array([
    [1,2,3],
    [4,5,6],
    [7,8,9],
    [3,2,1]
])

selectRow(arr, [1,2,3])

Output:

array([[1, 2, 3],
       [3, 2, 1]])

I'll leave the original answer up for the moment, just in case I'm wrong.
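For larger arrays, the same selection can be sketched without a Python-level loop by sorting each row and comparing against the sorted target. This is only an illustration: select_rows_unordered is a name made up here, and sorting compares rows as multisets, so it differs from the set-based version above when a row contains repeated values.

```python
import numpy as np

def select_rows_unordered(arr, selrow):
    # Hypothetical helper (not from the original answer).
    # Sort each row and the target so element order no longer matters.
    sorted_rows = np.sort(arr, axis=1)
    sorted_target = np.sort(np.asarray(selrow))
    # Keep rows whose sorted values match the sorted target in every column.
    mask = (sorted_rows == sorted_target).all(axis=1)
    return arr[mask]

arr = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [3, 2, 1],
])

result = select_rows_unordered(arr, [1, 2, 3])
print(result)
```

With the example array this keeps the rows [1, 2, 3] and [3, 2, 1], matching the output shown above.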

Original answer

ordered version

How about just:

import numpy as np

def withoutRow(arr, badrow):
    return np.array([row for row in arr if not np.array_equal(row, badrow)])

which you would then use like so:

arr = np.array([
    [1,2,3],
    [4,5,6],
    [7,8,9],
    [3,2,1]
])

withoutRow(arr, [1,2,3])

Output:

array([[4, 5, 6],
       [7, 8, 9],
       [3, 2, 1]])

withoutRow makes only a single pass over the rows of the original array and constructs only a single new array (the return value). The loop runs in Python, though, so on large arrays a vectorized boolean mask will usually be faster.
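For comparison, here is a minimal sketch of the same ordered removal done with a boolean mask, which is essentially the row-wise form of what the question's `A != [1,2,3]` attempt was reaching for. The name without_row_mask is introduced here for illustration only.

```python
import numpy as np

def without_row_mask(arr, badrow):
    # Hypothetical helper (not from the original answer).
    # A row is kept if it differs from badrow in at least one column.
    keep = (arr != np.asarray(badrow)).any(axis=1)
    return arr[keep]

arr = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [3, 2, 1],
])

result = without_row_mask(arr, [1, 2, 3])
print(result)
```

The key point is reducing the element-wise comparison along axis 1 before indexing, so the mask has one entry per row instead of one per element.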

unordered version

If you want to remove any point with matching coordinates without regard to the order of the coordinates, you could instead use:

def withoutRowUnordered(arr, badrow):
    badset = set(badrow)
    return np.array([row for row in arr if badset != set(row)])

withoutRowUnordered(arr, [1,2,3])

Output:

array([[4, 5, 6],
       [7, 8, 9]])
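If the rows to drop are known by position rather than by value, as the question's "elements 1-3 and 5-N" wording suggests, a boolean keep-mask over the row indices is one way to avoid calling numpy.delete directly. This is a sketch under that assumption; drop_rows_by_index is a made-up name, and whether it is actually faster than numpy.delete would need to be measured.

```python
import numpy as np

def drop_rows_by_index(arr, drop_idx):
    # Hypothetical helper (not from the original answer).
    # Start by keeping every row, then switch off the unwanted positions.
    keep = np.ones(len(arr), dtype=bool)
    keep[drop_idx] = False  # drop_idx may be a list of ints or a slice
    return arr[keep]

arr = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [3, 2, 1],
])

result = drop_rows_by_index(arr, [1, 2])
print(result)
```

Passing a slice works as well, for example drop_rows_by_index(arr, slice(1, 3)) removes the same two middle rows.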
