1

I have a list of records from a column; the list is named dates. I am trying to get the distinct dates out of the list. The list has many repeated dates, such as 1/1/2010, 1/1/2010, …, but there are different dates too. But if I use:

for date in dates: ....

it repeats the loop for every single date (whether or not it is a duplicate), not once per distinct date. How can I tell it to do:

for differentdate in dates:... 

The language is Python!!

Acorn
widget

4 Answers

5
for date in set(dates):

set() makes a collection out of the unique elements in another collection. Note: this may not preserve the order of the original list, so if you require that order to be preserved, go with @GregHewgill's answer.
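As a minimal sketch of the ordering caveat (the sample `dates` list here is made up for illustration): looping over `set(dates)` visits each distinct value once, and wrapping it in `sorted()` gives a repeatable order if you need one. Note that sorting these strings is lexicographic, not chronological.

```python
# Sample data, assumed for illustration only.
dates = ["1/3/2010", "1/1/2010", "1/1/2010", "1/7/2010", "1/3/2010"]

unique_dates = set(dates)          # each distinct value appears exactly once

for date in sorted(unique_dates):  # sorted() makes the iteration order repeatable
    print(date)
```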

Rafe Kettler
  • 1
    Note that this destroys ordering, i.e. you get the items in an undetermined order (well, it's deterministic, but you almost never know/care about some of the things that influence it, so it might as well be random). It's possible to retain order and filter dupes, but that requires about 5 lines more. –  Mar 25 '11 at 20:01
  • @delnan: I added a note, Greg has addressed this with groupby – Rafe Kettler Mar 25 '11 at 20:02
  • set(dates) works perfectly for me!! thank you so much. i knew it would be a minor change of codes – widget Mar 25 '11 at 21:06
5

You can use the itertools module to group by the dates. For example:

>>> import itertools
>>> a = ["aaa", "bbb", "bbb", "ccc"]
>>> for k, g in itertools.groupby(a):
...   print(k)
... 
aaa
bbb
ccc

This preserves the original order of the elements in a (which could be important for you). Inside the loop, g is an iterator over the elements that share that key. See the documentation for itertools.groupby for more information.
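One caveat worth sketching: groupby only merges runs of adjacent equal elements, so duplicates that are separated by other values end up in separate groups unless the input is sorted first. The sample list below is made up for illustration, and sorting gives up the original order.

```python
import itertools

# Sample data, assumed for illustration only.
a = ["bbb", "aaa", "bbb", "ccc"]

# Without sorting, the two "bbb" entries are not adjacent,
# so they form two separate groups.
unsorted_keys = [k for k, g in itertools.groupby(a)]

# After sorting, equal elements are adjacent, so each key appears once.
# Consuming g tells us how many elements shared that key.
counts = {k: len(list(g)) for k, g in itertools.groupby(sorted(a))}

print(unsorted_keys)
print(counts)
```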

Greg Hewgill
  • 1
    Typo in the URL. Also, this won't eliminate the dupe in `['a', 'b', 'a']`, which is probably what OP wants (you can `.sort()` to fix that though). –  Mar 25 '11 at 20:03
  • groupby seems interesting, but order does not matter to me; I just want the values. Thanks. @delnan: you are right, and .sort() is enough to solve the order problem too. – widget Mar 25 '11 at 20:38
0

Either of the following:

def uniqueItems(seq, key=None, reverse=False):
    "Returns a list of unique items in (customizable) order"
    seq = list(set(seq))
    seq.sort(key=key, reverse=reverse)
    return seq

def uniqueItems(seq):
    "Generator - yields unique items in original order of first occurrence"
    seen = set()
    for item in seq:
        if item not in seen:
            yield item
            seen.add(item)

can be used as

for date in uniqueItems(dates):
    # do something with date
    pass
Hugh Bothwell
0

If preserving order is important, the following generator function, derived from a comment by Alex Martelli about the Remove duplicates from a sequence ActiveState recipe, would work (and should also be relatively fast according to these benchmarks, which included the original dictionary-based, non-generator Martelli exemplar):

dates = ["1/1/2010", "1/3/2010", "1/3/2010", "1/7/2010"]

def unique(seq, idfun=lambda x: x):
    seen = set()
    for item in seq:
        marker = idfun(item)
        if marker not in seen:
            seen.add(marker)
            yield item

for date in unique(dates):
    print(date)

# 1/1/2010
# 1/3/2010
# 1/7/2010

Another nice feature is that it's fairly flexible and can be adapted to other data structures by providing a custom idfun to use to retrieve the datum to be compared.
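To illustrate the custom idfun, here is a rough sketch that keeps only the first record seen for each date. The `records` list of (date, amount) pairs and the name `first_per_date` are hypothetical, invented for this example; `unique` is repeated so the snippet runs on its own.

```python
def unique(seq, idfun=lambda x: x):
    # Yield items whose marker (as computed by idfun) has not been seen yet.
    seen = set()
    for item in seq:
        marker = idfun(item)
        if marker not in seen:
            seen.add(marker)
            yield item

# Hypothetical records: (date, amount) pairs, assumed for illustration only.
records = [("1/1/2010", 10), ("1/1/2010", 25), ("1/3/2010", 7)]

# Compare records by their date field only; the first record per date wins.
first_per_date = list(unique(records, idfun=lambda rec: rec[0]))
print(first_per_date)
```

The marker returned by idfun has to be hashable, since it is stored in a set.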

martineau