Subtract elements from one list from another list, using list comprehension. Returns incomplete list?

Question

I have an array l1 of size (81x2), and another l2 of size (8x2). All elements of l2 are also contained in l1. I'm trying to generate an array l3 of size (73x2) containing all elements of l1 minus the ones in l2 ( ==> l3 = l1 - l2 ), but using list comprehension.

I found many similar questions on here, and almost all agree on a solution like this to generate l3:

n = 9    
index = np.arange(n)   
 
l1 = np.array([(i,j) for i in index for j in index])
l2 = np.array([(0, 3),(0, 5),(2, 4),(4, 4),(4, 2),(4, 6),(8, 3),(8, 5)])
l3 = [(i,j) for (i,j) in l1 if (i,j) not in l2]

print(l3)

However, the code above generates an array l3 that only contains 20 of the expected (81-8=) 73 elements. I don't understand how list comprehension operates here or why only those particular 20 elements are kept. Can anyone help?

NOTE: many people advise using set() instead of list comprehension for this problem, but I haven't tried that yet and I'd really like to understand why list comprehension is failing in the code above.

Are the elements in each unique? If so `np.array(list(set(zip(*l1.T.tolist())).difference(zip(*l2.T.tolist()))))` — Onyambu, Apr 28 '22 at 20:50

hpaulj · Answer 1 · 2022-04-28T21:11:27.743

Let's test the first row of l1:

In [46]: i,j = l1[0]
In [47]: i,j
Out[47]: (0, 0)
In [48]: (i,j) in l2
Out[48]: True

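It's True because 0 occurs in the first column of l2. On a NumPy array, `in` evaluates `(l2 == (i,j)).any()`, which is an element-wise comparison; it isn't testing by rows.

There isn't a 7 anywhere in l2, so this is False: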

In [49]: (7,7) in l2
Out[49]: False
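
To see where the 20 in the question comes from, here is a minimal sketch using the arrays from the question. It relies on `x in a` behaving like `(a == x).any()` for a NumPy array, so one matching column entry is enough to make the test True. The names `col0`, `col1` and `kept` are only for illustration.

```python
import numpy as np

index = np.arange(9)
l1 = np.array([(i, j) for i in index for j in index])
l2 = np.array([(0, 3), (0, 5), (2, 4), (4, 4), (4, 2), (4, 6), (8, 3), (8, 5)])

# `(i, j) in l2` is evaluated like (l2 == (i, j)).any():
# True as soon as i matches something in column 0 OR j matches something in column 1.
assert ((0, 0) in l2) == (l2 == (0, 0)).any()

# A pair survives the original comprehension only if neither coordinate
# appears in "its" column of l2.
col0 = set(l2[:, 0].tolist())  # values 0, 2, 4, 8
col1 = set(l2[:, 1].tolist())  # values 2, 3, 4, 5, 6
kept = [(i, j) for i, j in l1.tolist() if i not in col0 and j not in col1]
print(len(kept))  # 5 remaining i values times 4 remaining j values
```

That leaves i in {1, 3, 5, 6, 7} and j in {0, 1, 7, 8}, which is 5 x 4 = 20 pairs.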

Make sure your list comprehension test works.
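
If you want to keep the list comprehension, one possible fix is to compare plain Python tuples instead of testing against the array. This is a sketch, not the only way to do it; `l2_pairs` is a name introduced here for illustration.

```python
import numpy as np

index = np.arange(9)
l1 = np.array([(i, j) for i in index for j in index])
l2 = np.array([(0, 3), (0, 5), (2, 4), (4, 4), (4, 2), (4, 6), (8, 3), (8, 5)])

# Convert the rows of l2 to plain tuples so that `in` compares whole pairs
# instead of broadcasting against the array.
l2_pairs = [tuple(row) for row in l2.tolist()]

l3 = [(i, j) for i, j in l1.tolist() if (i, j) not in l2_pairs]
print(len(l3))  # 73
```

With a list of tuples, `(i, j) not in l2_pairs` is an ordinary equality test on each pair, which is what the comprehension in the question was meant to do.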

One way to test for matches is:

In [72]: x = (l1==l2[:,None,:]).all(axis=2).any(axis=0)
In [73]: x
Out[73]: 
array([False, False, False,  True, False,  True, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False,  True, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False,  True, False,  True, False,  True, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False,  True, False,  True, False, False, False])

This has 8 True values, the ones that exactly match l2:

In [74]: x.sum()
Out[74]: 8
In [75]: l1[x]
Out[75]: 
array([[0, 3],
       [0, 5],
       [2, 4],
       [4, 2],
       [4, 4],
       [4, 6],
       [8, 3],
       [8, 5]])

So the rest would be accessed with:

In [76]: l1[~x]
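
The one-liner in In [72] relies on broadcasting, so it may help to look at the shapes of the intermediate results. The sketch below splits it into steps; the intermediate names `a`, `eq` and `rows_equal` are introduced here for illustration.

```python
import numpy as np

index = np.arange(9)
l1 = np.array([(i, j) for i in index for j in index])
l2 = np.array([(0, 3), (0, 5), (2, 4), (4, 4), (4, 2), (4, 6), (8, 3), (8, 5)])

a = l2[:, None, :]           # (8, 1, 2): None inserts a new axis of length 1
eq = (l1 == a)               # (81, 2) broadcast against (8, 1, 2) gives (8, 81, 2)
rows_equal = eq.all(axis=2)  # (8, 81): True where both coordinates match
x = rows_equal.any(axis=0)   # (81,): True where a row of l1 equals any row of l2

l3 = l1[~x]
print(a.shape, eq.shape, rows_equal.shape, x.shape, l3.shape)
```

The extra axis makes NumPy compare every row of l2 with every row of l1, rather than trying (and failing) to align an (81, 2) array with an (8, 2) one.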

To work with sets, we need to convert the rows of the arrays to tuples (array rows aren't hashable):

In [85]: s1 = set([tuple(x) for x in l1])
In [86]: s2 = set([tuple(x) for x in l2])
In [87]: len(s1.difference(s2))
Out[87]: 73
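
If the result is needed as an array again, the set difference can be converted back. A small sketch: sets are unordered, so the pairs are sorted here, which for this particular l1 happens to reproduce its original row order.

```python
import numpy as np

index = np.arange(9)
l1 = np.array([(i, j) for i in index for j in index])
l2 = np.array([(0, 3), (0, 5), (2, 4), (4, 4), (4, 2), (4, 6), (8, 3), (8, 5)])

s1 = {tuple(row) for row in l1.tolist()}
s2 = {tuple(row) for row in l2.tolist()}

# Sort the remaining pairs before rebuilding the array, since set order is arbitrary.
l3 = np.array(sorted(s1 - s2))
print(l3.shape)  # (73, 2)
```

Note that a set also drops duplicate rows, so this only matches the masking approach when the rows of l1 are unique, as they are here.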

Another approach is to convert the arrays to structured arrays:

In [88]: import numpy.lib.recfunctions as rf
In [102]: r1 = rf.unstructured_to_structured(l1,dtype=np.dtype('i,i'))
In [103]: r2 = rf.unstructured_to_structured(l2,dtype=np.dtype('i,i'))
In [104]: r2
Out[104]: 
array([(0, 3), (0, 5), (2, 4), (4, 4), (4, 2), (4, 6), (8, 3), (8, 5)],
      dtype=[('f0', '<i4'), ('f1', '<i4')])

Now isin works - the arrays are both 1d, as required by isin:

In [105]: np.isin(r1,r2)
Out[105]: 
array([False, False, False,  True, False,  True, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False,  True, False, False, False, False,
       ...])
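
Put together, the structured-array route could look like the sketch below. It assumes `np.isin` accepts structured arrays in your NumPy version, as it does in the session above; `pair` and `mask` are names introduced here for illustration.

```python
import numpy as np
import numpy.lib.recfunctions as rf

index = np.arange(9)
l1 = np.array([(i, j) for i in index for j in index])
l2 = np.array([(0, 3), (0, 5), (2, 4), (4, 4), (4, 2), (4, 6), (8, 3), (8, 5)])

# Each row becomes one structured element with two integer fields,
# so both arrays are 1d and isin compares whole pairs.
pair = np.dtype('i,i')
r1 = rf.unstructured_to_structured(l1, dtype=pair)
r2 = rf.unstructured_to_structured(l2, dtype=pair)

mask = np.isin(r1, r2)  # True for rows of l1 that also occur in l2
l3 = l1[~mask]          # index the original 2d array with the inverted mask
print(mask.sum(), l3.shape)
```

Indexing the original l1 with `~mask` keeps the result as a plain (73, 2) integer array rather than a structured one.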

I couldn't have asked for a more complete answer, thank you! — Qosa, Apr 29 '22 at 08:49
I just have a question about this line: In [72]: x = (l1==l2[:,None,:]).all(axis=2).any(axis=0). How come l2 suddenly has 3 dimensions instead of 2? — Qosa, Apr 29 '22 at 08:58
