Iterate by groups of similar items in Python

Asked Mar 19 '15 at 22:38

What would be the most efficient way to group consecutive equal items in a large list, keeping the groups in their original order? For example: ['a','b','b','c','c','c','b','c','c'] --> [['a'],['b','b'],['c','c','c'],['b'],['c','c']]

My current solution is a simple generator that works its way through the list. This works fine, but I would love to know of better alternatives, especially when dealing with very long lists of 1 million entries or more.

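def grouper(items):
    """Yield runs of consecutive equal items as lists, in order."""
    grp = []
    for i in items:
        if grp and i != grp[-1]:
            # Item differs from the current run: emit the run, start a new one.
            yield grp
            grp = [i]
        else:
            grp.append(i)

    # Emit the final run, if there is one.
    if grp:
        yield grp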

data = ['a','b','b','c','c','c','b','c','c']
for x in grouper(data):
    # Do something useful with this group
    print('#>>>%s' % x)

#>>>['a']
#>>>['b', 'b']
#>>>['c', 'c', 'c']
#>>>['b']
#>>>['c', 'c']

asked Mar 19 '15 at 22:38

Fnord

https://docs.python.org/2/library/itertools.html#itertools.groupby – wim Mar 19 '15 at 22:41
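For what it's worth, here is a minimal sketch of how the `itertools.groupby` function linked above could be applied to this problem. The name `grouper_groupby` is made up for illustration and is not from the question. With no key function, `groupby` starts a new group every time the value changes, which matches the "consecutive equal items" behaviour asked for.

```python
from itertools import groupby


def grouper_groupby(items):
    """Yield each run of consecutive equal items as a list (hypothetical helper)."""
    for _key, run in groupby(items):
        # groupby hands back a lazy iterator for each run; list() materializes it.
        yield list(run)


data = ['a', 'b', 'b', 'c', 'c', 'c', 'b', 'c', 'c']
for group in grouper_groupby(data):
    # Do something useful with this group
    print('#>>>%s' % group)
```

If the groups do not need to exist as lists, the `list(run)` call can probably be dropped and each run consumed directly, as long as it is consumed before advancing to the next group. Whether this is actually faster than the hand-written generator on a list of a million entries is something to measure rather than assume.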
