What would be the most efficient way to group runs of consecutive equal items in a large list, keeping the original order? For example: ['a','b','b','c','c','c','b','c','c'] --> [['a'],['b','b'],['c','c','c'],['b'],['c','c']]

My current solution is a simple generator that walks through the list. This works fine, but I would love to know of better alternatives, especially for very long lists of 1 million entries or more.

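def grouper(items):
    # Collect consecutive equal items into one list per run.
    grp = []
    for i in items:
        if grp and i != grp[-1]:
            # The value changed: emit the finished run and start a new one.
            yield grp
            grp = [i]
        else:
            grp.append(i)

    # Emit the final run (skipped when items is empty).
    if grp:
        yield grp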

for x in grouper(['a','b','b','c','c','c','b','c','c']):
    # Do something useful with this group
    print('#>>>%s' % x)

#>>>['a']
#>>>['b', 'b']
#>>>['c', 'c', 'c']
#>>>['b']
#>>>['c', 'c']
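
For comparison, here is a minimal sketch of the same grouping built on `itertools.groupby` from the standard library, which groups consecutive equal items when called without a key function. The name `grouper_groupby` is made up for this illustration. Whether it is actually faster on a million-entry list has not been measured here and would need to be timed.

```python
from itertools import groupby


def grouper_groupby(items):
    """Yield lists of consecutive equal items, in their original order.

    Hypothetical helper name, used only for this sketch.
    """
    # With no key function, groupby starts a new group each time the value changes.
    for _key, run in groupby(items):
        # `run` is a lazy iterator tied to groupby's state, so copy it into a list.
        yield list(run)


result = list(grouper_groupby(['a', 'b', 'b', 'c', 'c', 'c', 'b', 'c', 'c']))
print(result)
# [['a'], ['b', 'b'], ['c', 'c', 'c'], ['b'], ['c', 'c']]
```

Like the hand-written generator, this is lazy and only holds one group in memory at a time, so it should also work on input that is streamed rather than held in a list.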
Fnord
