
My code deletes all occurrences of a duplicated sublist unintentionally. I would like to keep one occurrence from each set of duplicates.

For example, I have [[1,2,3],[1,3,2],[4,5,6],[5,6,4]] and the desired output would be akin to [[1,2,3],[4,5,6]].

from itertools import permutations

c = [[1, 2, 3], [4, 5, 6], [4, 6, 5]]

remove_sets = []
for a in range(len(c)):
    for b in permutations(c[a], 3):
        # my idea is that if list(b) != c[a]
        # it should not delete all occurrences.
        if list(b) != c[a]:
            if list(b) in c:
                remove_sets.append(list(b))

# delete those occurrences.
for cc in range(len(remove_sets)):
    if remove_sets[cc] in c:
        del c[c.index(remove_sets[cc])]

Unintended Result/Output

[[1,2,3]]

My desired output would be

[[1,2,3],[4,5,6]]

Question

Is there a function for removing these duplicate sets where order is switched around?

  • I'm sticking to Fixed Three Elements! Perhaps the set comparison would be more efficient. I'm not sure how it would be done! – Travis Wells Jun 03 '20 at 01:51
  • 1
    Your text and title don't seem to match. Your text seems to want to ignore order but your title says order matters. – ggorlen Jun 03 '20 at 01:59
  • Possible duplicates: [Get unique elements from list of lists when the order of the sublists does not matter](https://stackoverflow.com/q/50769558/674039), [Efficiently remove duplicates, order-independent, from list of unordered sets](https://stackoverflow.com/q/57466243/674039) – wim Jun 03 '20 at 02:04
  • Does this answer your question? [Efficiently remove duplicates, order-independent, from list of unordered sets](https://stackoverflow.com/questions/57466243/efficiently-remove-duplicates-order-independent-from-list-of-unordered-sets) – ggorlen Jun 03 '20 at 02:05
  • I went ahead and edited your title to match your example, but feel free to rollback if that's not your intent. – ggorlen Jun 03 '20 at 02:06

1 Answer


groupby works if your duplicate sublists are already adjacent and you need to squeeze them into single units:

>>> from itertools import groupby
>>> [next(v) for _, v in groupby(c, sorted)]
[[1, 2, 3], [4, 5, 6]]

Calling sorted as the key ignores order when grouping, so we can take the first item from each group to obtain your result.
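If the duplicates are not already adjacent, one option (a sketch, not part of the original answer) is to sort c by the same sorted key first so that order-insensitive duplicates become neighbors before grouping:

```python
from itertools import groupby

c = [[1, 2, 3], [4, 5, 6], [1, 3, 2], [4, 6, 5]]

# sorting by the same key that groupby uses makes duplicates adjacent
deduped = [next(v) for _, v in groupby(sorted(c, key=sorted), key=sorted)]
print(deduped)  # [[1, 2, 3], [4, 5, 6]]
```

Note this costs an extra O(n log n) pass and reorders the outer list by the sorted keys.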

Otherwise, for the general case, using a dictionary comprehension like

>>> list({tuple(sorted(x)): x for x in c}.values())
[[1, 2, 3], [4, 6, 5]]

works but it only selects the last-seen item in c. If you reverse-iterate c, you'll get the first-seen:

>>> list({tuple(sorted(x)): x for x in c[::-1]}.values())[::-1]
[[1, 2, 3], [4, 5, 6]]

Be aware that sorted is O(m log m) in the length m of each sublist, so with n sublists the overall complexity is

longest_len = max(map(len, c))
O(n * longest_len * log(longest_len))

If you need to scale to large inner lists, consider collections.Counter instead of sorted.
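A sketch of that idea (the frozenset-of-counts key is my own illustration, not from the original answer): Counter builds an order-insensitive signature in linear time per sublist, and setdefault keeps the first-seen representative without any reversing.

```python
from collections import Counter

c = [[1, 2, 3], [4, 5, 6], [4, 6, 5]]

seen = {}
for x in c:
    # frozenset of (element, count) pairs: hashable, order-insensitive,
    # and built in O(len(x)) instead of O(len(x) log len(x))
    key = frozenset(Counter(x).items())
    seen.setdefault(key, x)  # keeps the first-seen sublist per key

result = list(seen.values())
print(result)  # [[1, 2, 3], [4, 5, 6]]
```

Unlike the sorted-key version, this also handles sublists with repeated elements correctly, since the counts are part of the key.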

  • Groupby works but is suboptimal. You should mention that the input `c` needs to be *already* sorted into consecutive groups in order for groupby to work correctly here. – wim Jun 03 '20 at 02:09
  • Yes, that's a good point. I didn't think about the problem very carefully. – ggorlen Jun 03 '20 at 02:11
  • Updated... the dupe targets are best but I'm surprised they have so few views. – ggorlen Jun 03 '20 at 02:16