
I've searched for pointers on this question but couldn't find any. Suppose I have a collection in Python (a set of tuples of coordinate pairs):

list = set([((3, 2), (2, 1)),
            ((3, 2), (3, 1)),
            ((3, 1), (2, 1)), 
            ((2, 1), (1,3), (2, 3))])

I want to filter this collection so that any entry containing two or more pairs with the same first element is thrown out. For example, the output for the input above should be

set([((3, 2), (2, 1)),
     ((3, 1), (2, 1))])

This is because ((3, 2), (3, 1)) and ((2, 1), (1, 3), (2, 3)) are entries in which at least two of the coordinate pairs have the same first element. Is there a fast and easy way to do this?

As it stands, I am thinking of doing something like

[x for x in list if ... ]

where, for each x, I fix x[k][0] and compare it with x[i][0] for every other i, then repeat for every k. I feel there has to be a better way to do this. I hope the question is clear enough, and I greatly appreciate your help.
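For reference, the pairwise comparison described above might look roughly like the following minimal sketch. The names `has_repeated_first` and `entries` are hypothetical and do not appear in the original post; `entries` is used only to avoid shadowing the built-in `list`.

```python
from itertools import combinations


def has_repeated_first(item):
    """Return True if any two pairs in item share the same first element."""
    # Compare every unordered pair of coordinate pairs (quadratic in len(item)).
    return any(a[0] == b[0] for a, b in combinations(item, 2))


# Hypothetical name for the input set from the question.
entries = {((3, 2), (2, 1)),
           ((3, 2), (3, 1)),
           ((3, 1), (2, 1)),
           ((2, 1), (1, 3), (2, 3))}

# Keep only the entries whose first elements are all distinct.
kept = {x for x in entries if not has_repeated_first(x)}
```

This works, but it compares each pair of coordinates explicitly, which is the cost the question hopes to avoid.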

Sam Mussmann

3 Answers


You could use

def throw_out_elements(iterable):
    for x in iterable:
        # Keep x only if the first elements of its pairs are all distinct.
        if len(set(y for y, _ in x)) == len(x):
            yield x

Then to use this:

S = set([((3, 2), (2, 1)),
            ((3, 2), (3, 1)),
            ((3, 1), (2, 1)), 
            ((2, 1), (1,3), (2, 3))])
print(list(throw_out_elements(S)))

Output (the order may vary, since S is a set): [((3, 2), (2, 1)), ((3, 1), (2, 1))]
Stuart

This can be done quite easily with a simple set comprehension and a simple function:

def no_duplicates(x):
    seen = set()
    return not any(i in seen or seen.add(i) for i in x)

data = {((3, 2), (2, 1)),
        ((3, 2), (3, 1)),
        ((3, 1), (2, 1)),
        ((2, 1), (1,3), (2, 3))}

print({item for item in data if no_duplicates(first for first, _ in item)})

Producing:

{((3, 2), (2, 1)), 
 ((3, 1), (2, 1))}

We keep an item only if the first elements of its pairs are all distinct. The simple no_duplicates() function (pulled from this great answer) performs that check, doing what it says on the tin.
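To see why no_duplicates() works, note that set.add() returns None, so `i in seen or seen.add(i)` is truthy only when `i` has been seen before, and any() stops at the first such value. The sketch below illustrates this; the `tracked` helper and the `consumed` list are hypothetical additions used only to record which values are actually read.

```python
def no_duplicates(x):
    seen = set()
    # set.add() returns None (falsy), so the expression is truthy only
    # when i was already in seen; any() stops at the first such i.
    return not any(i in seen or seen.add(i) for i in x)


consumed = []


def tracked(values):
    """Yield values one by one, recording each one that is actually read."""
    for v in values:
        consumed.append(v)
        yield v


result = no_duplicates(tracked([2, 1, 2, 5, 6]))
print(result)    # False
print(consumed)  # [2, 1, 2]
```

The values 5 and 6 are never read, because the check stops as soon as the second 2 is found.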

Gareth Latty

If you're dead set on a single list comprehension, the following would work.

my_list = set([((3, 2), (2, 1)),
        ((3, 2), (3, 1)),
        ((3, 1), (2, 1)),
        ((2, 1), (1,3), (2, 3))])

[x for x in my_list if len(set([y[0] for y in x])) == len(x)]

Edit: First answer was wrong as I misread the question.

jeffknupp
  • This is a duplicate of my answer, except with the function inlined. Edit: As @Stuart points out below, inlined incorrectly, so this is wrong. – Gareth Latty Dec 26 '12 at 21:00
  • But unlike @Lattyware's answer I don't think it will do what the questioner is asking for... read it carefully – Stuart Dec 26 '12 at 21:00
  • Note that checking the length of the set means that all elements must be consumed into the set (it's not lazy). – Gareth Latty Dec 26 '12 at 21:25
  • It's not meant to be. If the length of pairs is significantly greater than the example given, this may of course be slightly less efficient than creating a function that breaks on the first duplicate (though I'd guess that for a reasonably high value of len, function call overhead would dominate the time spent). – jeffknupp Dec 26 '12 at 21:34
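The trade-off discussed in these comments can be measured rather than guessed. The following is a minimal sketch using the standard timeit module; the function names and the test inputs are hypothetical, and the timings will depend on the machine and Python version, so no figures are claimed here.

```python
import timeit


def unique_eager(firsts):
    """Length check: consumes every element before deciding."""
    firsts = list(firsts)
    return len(set(firsts)) == len(firsts)


def unique_lazy(firsts):
    """Short-circuiting check: stops at the first duplicate."""
    seen = set()
    return not any(f in seen or seen.add(f) for f in firsts)


# Hypothetical inputs: a short tuple like those in the question, and a
# long sequence whose duplicate appears right at the start.
short = [3, 2]
long_dup_early = [0, 0] + list(range(1, 10000))

for name, data in [("short", short), ("long, early duplicate", long_dup_early)]:
    # Both approaches must agree on the answer; only the cost differs.
    assert unique_eager(data) == unique_lazy(data)
    t_eager = timeit.timeit(lambda: unique_eager(data), number=200)
    t_lazy = timeit.timeit(lambda: unique_lazy(data), number=200)
    print("%s: eager %.4fs, lazy %.4fs" % (name, t_eager, t_lazy))
```

Running this on inputs shaped like your real data should show which effect matters more in practice.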