14

I have a list of strings similar to this list:

tags = ('apples', 'apricots', 'oranges', 'pears', 'peaches')

How should I go about grouping this list by the first character in each string using itertools.groupby()? How should I supply the 'key' argument required by itertools.groupby()?

Adam Ziolkowski

4 Answers

26

You might want to create a dict afterwards:

from itertools import groupby

d = {k: list(v) for k, v in groupby(sorted(tags), key=lambda x: x[0])}
Matthew
Pratik Deoghare
  • ...But don't forget to sort it first! – Matthew Jun 15 '20 at 06:19
  • @Matthew why do we need to sort? – leleogere Sep 29 '22 at 14:36
  • @leleogere `itertools.groupby()` requires the iterable to be sorted, see https://docs.python.org/3/library/itertools.html#itertools.groupby – Matthew Sep 29 '22 at 15:49
  • Thanks, I see why now. However, I don't understand the formulation: "***Generally**, the iterable needs to already be sorted on the same key function*". Why generally and not always then? – leleogere Sep 30 '22 at 07:13
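
On the "generally" wording: `groupby()` only merges consecutive items that share a key, so unsorted input is not an error, it just yields the same key more than once. A small sketch of the difference, using a shuffled copy of the question's list (the names `unsorted_tags` and `first_char` are made up for illustration):

```python
from itertools import groupby

# Hypothetical shuffled variant of the question's tags
unsorted_tags = ('pears', 'apples', 'peaches', 'apricots', 'oranges')

def first_char(s):
    return s[0]

# Without sorting: 'p' and 'a' each start a new group every time they reappear
runs = [(k, list(g)) for k, g in groupby(unsorted_tags, key=first_char)]

# Sorting on the same key first gives exactly one group per letter
grouped = [
    (k, list(g))
    for k, g in groupby(sorted(unsorted_tags, key=first_char), key=first_char)
]

print(runs)
print(grouped)
```

Grouping consecutive runs is sometimes exactly what is wanted (collapsing repeated values, for instance), which is presumably why the docs say "generally" rather than "always".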
16
groupby(sorted(tags), key=operator.itemgetter(0))
Elias Zamaria
Ignacio Vazquez-Abrams
  • It works on unicodes. If you're asking if it works on UTF-8 strings, then you should instead be asking when you should decode it to a unicode. The answer, of course, is as soon as it comes in. – Ignacio Vazquez-Abrams Mar 18 '10 at 17:47
  • Thanks, it works as expected. I do have a list of tags in multiple languages and I'll be testing the ordering with various translators. – Adam Ziolkowski Mar 18 '10 at 23:11
  • Actually, it should be: `groupby(sorted(tags), key=operator.itemgetter(0))` – sandyp Jan 26 '16 at 19:18
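
For anyone pasting the one-liner above into a script, here is a self-contained sketch with the imports it needs. The variable names `first_char` and `groups` are illustrative, not part of the original answer.

```python
from itertools import groupby
from operator import itemgetter

tags = ('apples', 'apricots', 'oranges', 'pears', 'peaches')

# itemgetter(0) behaves like lambda x: x[0]
first_char = itemgetter(0)

# Each group is a lazy iterator tied to the groupby object, so copy it
# into a list before advancing to the next group
groups = [
    (letter, list(words))
    for letter, words in groupby(sorted(tags), key=first_char)
]

for letter, words in groups:
    print(letter, words)
```

Because `sorted(tags)` orders the whole strings, 'peaches' comes before 'pears' within the 'p' group. Either form of the key raises `IndexError` on an empty string, so filter those out first if they can occur.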
5
>>> import itertools
>>> for i, j in itertools.groupby(tags, key=lambda x: x[0]):
...     print(i, list(j))
...
a ['apples', 'apricots']
o ['oranges']
p ['pears', 'peaches']
SilentGhost
2

Just another way:

>>> from collections import defaultdict
>>> t=defaultdict(list)
>>> for items in tags:
...     t[items[0]].append(items)
...
>>> t
defaultdict(<class 'list'>, {'a': ['apples', 'apricots'], 'o': ['oranges'], 'p': ['pears', 'peaches']})
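
One practical difference from `groupby()` is that this approach does not need sorted input. A minimal sketch, wrapping the loop in a hypothetical helper `group_by_first_char` and feeding it a shuffled list:

```python
from collections import defaultdict

def group_by_first_char(words):
    """Hypothetical helper: map each first character to the words starting with it."""
    groups = defaultdict(list)
    for word in words:
        groups[word[0]].append(word)
    # Return a plain dict so later lookups of missing keys raise KeyError
    return dict(groups)

# Shuffled input, no sorting required
result = group_by_first_char(('pears', 'apples', 'peaches', 'apricots', 'oranges'))
print(result)
```

On Python 3.7+ the keys come out in first-seen order ('p', 'a', 'o' here), and words keep their input order within each group.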
ghostdog74