0

I have a list of tuples/lists: (-1, 0, 1) (-1, 1, 0) (-1, 2, -1) (-1, -1, 2) (0, 1, -1)

I need them reduced to: (-1, 1, 0) (-1, 2, -1)

I want (-1, 0, 1) and (-1, 1, 0) to map to the same thing. I thought of using something like a set, but that would remove any duplicate elements I might have within a tuple.

While generating a new tuple, say (-1, -1, 2), I want to perform a check like:

if (-1, -1, 2) in seen:
    pass
else:
    insert(seen, (-1, -1, 2))

For this I need the data structure to be hashable so that lookup is O(1). Any ideas how I would implement this in Python?

  • 2
    You could sort the elements of the tuple, if I understand what you are asking. – Scott Hunter Mar 20 '19 at 18:27
  • 2
    Why are `(-1, 0, 1)` and `(-1, 1, 0)` the same, because they have the same values, but are not ordered? – Martijn Pieters Mar 20 '19 at 18:28
  • It's probably impossible to do in `O(1)` because converting the tuple and comparing it is always at least `O(n)` (average case) where `n` is the number of elements in the tuple. – MSeifert Mar 20 '19 at 18:43
  • @MSeifert well, but it depends what N is, right? If size of the tuples doesn't vary, then this won't affect the complexity. IOW, this will still scale linearly on *the number of tuples* You just have a higher constant factor due to what amounts to a costly hash function – juanpa.arrivillaga Mar 20 '19 at 18:44
  • @juanpa.arrivillaga Yeah, however that's only implicit in the question. Would be nice if that were clarified. :) – MSeifert Mar 20 '19 at 18:45

4 Answers

1

FrozenMultiset from the multiset package does what you want.

Abstractly, a multiset is unordered and "allows" duplicates. Some implementations of multisets are unhashable, but not FrozenMultiset, fortunately.
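
As a rough sketch of how this could look for the check in the question: each tuple is turned into a hashable multiset key and stored in an ordinary set. The name `make_key` is made up for this example, and the `except` branch is a standard-library stand-in (an assumption, not part of the multiset package) so that the snippet still runs when the package is not installed.

```python
from collections import Counter

try:
    # Third-party package named in the answer: pip install multiset
    from multiset import FrozenMultiset as make_key
except ImportError:
    # Stand-in for this sketch only: a frozenset of (element, count)
    # pairs is also hashable and ignores element order.
    def make_key(items):
        return frozenset(Counter(items).items())

tuples = [(-1, 0, 1), (-1, 1, 0), (-1, 2, -1), (-1, -1, 2), (0, 1, -1)]

seen = set()
for t in tuples:
    seen.add(make_key(t))  # reordered tuples produce equal keys

# Same elements with the same counts, different order: already seen.
is_known = make_key((2, -1, -1)) in seen
# Same distinct values but different counts: not seen.
is_new = make_key((2, 2, -1)) not in seen
```

Building the key still takes time proportional to the tuple length; only the set lookup itself is constant time on average.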

0

You could sort each tuple and use a set to check for duplicates, since tuples are hashable:

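a = [(-1, 0, 1), (-1, 1, 0), (-1, 2, -1), (-1, -1, 2), (0, 1, -1)]
my_set = set()  # sorted forms seen so far
res = []        # first original tuple for each sorted form
for original_value in a:
    # Sorting ignores order but keeps repeated elements
    sorted_value = tuple(sorted(original_value))
    if sorted_value not in my_set:
        res.append(original_value)
        my_set.add(sorted_value)
print(res)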

Output

[(-1, 0, 1), (-1, 2, -1)]
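
The same idea can be applied to the incremental check from the question, where tuples are generated one at a time. This is a minimal sketch; `canonical` and `add_if_new` are hypothetical helper names chosen for illustration.

```python
def canonical(t):
    """Return an order-insensitive, hashable key that keeps repeated elements."""
    return tuple(sorted(t))

seen = set()

def add_if_new(t):
    """Record t unless a reordering of it was already recorded.

    Returns True if t was new, False if it was a duplicate.
    """
    key = canonical(t)
    if key in seen:
        return False
    seen.add(key)
    return True

results = [add_if_new(t) for t in
           [(-1, 0, 1), (-1, 1, 0), (-1, 2, -1), (-1, -1, 2), (0, 1, -1)]]
print(results)  # [True, False, True, False, False]
```

Sorting costs O(k log k) for a tuple of length k, so with fixed-size tuples each check is constant time on average.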

You can also use a defaultdict to group the original tuples by their sorted form:

from collections import defaultdict

a = [(-1, 0, 1), (-1, 1, 0), (-1, 2, -1), (-1, -1, 2), (0, 1, -1)]
d = defaultdict(list)  # sorted form -> original tuples that share it

for original_value in a:
    d[tuple(sorted(original_value))].append(original_value)

Output:

{
(-1, -1, 2): [(-1, 2, -1), (-1, -1, 2)], 
(-1, 0, 1): [(-1, 0, 1), (-1, 1, 0), (0, 1, -1)]
}
mad_
  • The question title says "no order" - but it fulfills the functional requirements. – MSeifert Mar 20 '19 at 18:41
  • Ahh, I didn't read the question. O(1) is hard to achieve, as there is a higher chance of collisions. – mad_ Mar 20 '19 at 18:44
0

You can use set to avoid adding elements that map to the same thing.

l = [(-1, 0, 1), (-1, 1, 0), (-1, 2, -1), (-1, -1, 2), (0, 1, -1)]

new_l = []

for i in l:
    if set(i) not in [set(j) for j in new_l]:
        new_l.append(i)

print(new_l)

It returns [(-1, 0, 1), (-1, 2, -1)]

Edit

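The approach above incorrectly flags some tuples as duplicates, because converting a tuple to a set discards repeated elements: (1, 1, 2) and (1, 2, 2) both become {1, 2}. Sorting each tuple instead should work: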

l = [(-1, 0, 1), (-1, 1, 0), (-1, 2, -1), (-1, -1, 2), (0, 1, -1)]

new_l = list(set([tuple(sorted(i)) for i in l]))

print(new_l)
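
To illustrate the difference between the two versions, here is a small check on two tuples that share the same distinct values but not the same counts. The example tuples and variable names are chosen purely for illustration.

```python
a = (1, 1, 2)
b = (1, 2, 2)

# Sets drop repeated elements, so both tuples collapse to {1, 2}.
set_collides = set(a) == set(b)

# Sorted tuples keep the counts, so the two stay distinct.
sorted_collides = tuple(sorted(a)) == tuple(sorted(b))

# A true reordering of a still maps to the same sorted tuple.
same_multiset = tuple(sorted((2, 1, 1))) == tuple(sorted(a))

print(set_collides, sorted_collides, same_multiset)  # True False True
```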
0

You can use collections.Counter to efficiently take the signatures of each tuple in your list, map the items of the Counter objects to frozensets so the signatures become hashable, put them in a set to de-duplicate, and then re-create the tuples using the Counter.elements() method:

from collections import Counter
l = [(-1, 0, 1), (-1, 1, 0), (-1, 2, -1), (-1, -1, 2), (0, 1, -1)]
[tuple(Counter(dict(i)).elements()) for i in {frozenset(Counter(t).items()) for t in l}]

This returns:

[(0, -1, 1), (-1, -1, 2)]
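
For readability, the one-liner could be unpacked roughly as follows. This is a sketch rather than a drop-in equivalent: `signature` and `unique_multisets` are hypothetical names, and unlike the one-liner this version yields results in order of first appearance. The order of elements inside each rebuilt tuple is still arbitrary.

```python
from collections import Counter

def signature(t):
    """Hashable, order-insensitive key: a frozenset of (element, count) pairs."""
    return frozenset(Counter(t).items())

def unique_multisets(tuples):
    """Return one rebuilt tuple per distinct signature, in first-seen order."""
    seen = set()
    result = []
    for t in tuples:
        sig = signature(t)
        if sig not in seen:
            seen.add(sig)
            # Rebuild a tuple from the (element, count) pairs.
            result.append(tuple(Counter(dict(sig)).elements()))
    return result

l = [(-1, 0, 1), (-1, 1, 0), (-1, 2, -1), (-1, -1, 2), (0, 1, -1)]
unique = unique_multisets(l)
```

One point in favor of this signature is that it only needs the elements to be hashable, whereas sorting also needs them to be mutually comparable.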
blhsing
  • The result is wrong, expected was `(-1, 1, 0) (-1, 2, -1)`. – MSeifert Mar 20 '19 at 19:00
  • The order of the items within the tuple does not matter as stated by the OP, and is exactly why differently ordered tuples are considered duplicates in the first place. – blhsing Mar 20 '19 at 19:02
  • Ah, I thought that was just the requirement for the comparison, not for the result. It's a bit unclear though – MSeifert Mar 20 '19 at 19:03
  • Well, if the OP wants to keep the first tuple of each unique combination of items in the result, the expected output would've been `(-1, 0, 1), (-1, 2, -1)`. To me the fact that the OP expects `(-1, 1, 0)` to be in the output implies that the order doesn't really matter. But yes, some clarification from the OP would be nice here. – blhsing Mar 20 '19 at 19:10