How to get unique values with respective occurrence count from a list in Python?

Question

I have a list which has repeating items and I want a list of the unique items with their frequency.

For example, I have ['a', 'a', 'b', 'b', 'b'], and I want [('a', 2), ('b', 3)].

Looking for a simple way to do this without looping twice.

Just so you know... the answer you accepted violates your "without looping twice" constraint. (I'm commenting here so that you get notified :-).) — Tom, Mar 06 '10 at 15:41
Can you just clarify your question a little bit too? Are your items always grouped together? Or can they appear in any order in the list? — Tom, Mar 06 '10 at 15:57
Yes, Tom. Although my question does not specify this - but in my particular situation, the values are coming sorted. Thanks. — Samantha Green, Mar 06 '10 at 16:02

score 73 · Answer 1 · edited Jan 28 '19 at 14:41

With Python 2.7+, you can use collections.Counter.

Otherwise, see this counter recipe.

Under Python 2.7+:

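from collections import Counter

items = ['a', 'a', 'b', 'b', 'b']  # renamed from input to avoid shadowing the built-in
c = Counter(items)

print(list(c.items()))  # list() is needed on Python 3, where items() returns a view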

Output is:

[('a', 2), ('b', 3)]
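
As a hedged illustration of how the result is ordered: on Python 3.7+ a Counter keeps keys in first-seen order, so unsorted input does not come back alphabetically. The helper name unique_with_counts below is hypothetical (it is not part of the answer); sorting the pairs is one way to get alphabetical order when that matters.

```python
from collections import Counter

def unique_with_counts(values):
    """Return (item, count) pairs in the order items were first seen."""
    return list(Counter(values).items())

# Unsorted input: 'b' is seen before 'a'.
pairs = unique_with_counts(['b', 'a', 'b', 'a', 'b'])
print(pairs)          # [('b', 3), ('a', 2)]
print(sorted(pairs))  # [('a', 2), ('b', 3)]
```

This makes a single pass over the input, which fits the "without looping twice" constraint in the question.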

edited Jan 28 '19 at 14:41

jpp

answered Mar 06 '10 at 15:20

mmmmmm

Eli Bendersky · Accepted Answer · 2010-03-06T15:48:37.673

15

If your items are grouped (i.e. similar items come together in a bunch), the most efficient method to use is itertools.groupby:

>>> import itertools
>>> [(key, len(list(group))) for key, group in itertools.groupby(['a', 'a', 'b', 'b', 'b'])]
[('a', 2), ('b', 3)]
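
To show why the grouping caveat matters, here is a minimal sketch (the helper name run_lengths is made up for this example). groupby only merges adjacent equal items, so ungrouped input yields repeated keys unless it is sorted first.

```python
from itertools import groupby

def run_lengths(values):
    """Count each run of consecutive equal items."""
    return [(key, sum(1 for _ in group)) for key, group in groupby(values)]

ungrouped = ['a', 'b', 'a', 'b', 'b']

# Adjacent runs only: 'a' and 'b' each appear more than once in the result.
print(run_lengths(ungrouped))          # [('a', 1), ('b', 1), ('a', 1), ('b', 2)]

# Sorting first groups equal items together, at the cost of an O(n log n) sort.
print(run_lengths(sorted(ungrouped)))  # [('a', 2), ('b', 3)]
```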

edited Mar 06 '10 at 15:48

answered Mar 06 '10 at 15:18

Eli Bendersky

@Tom: I'm aware of this limitation. When the items are grouped, however, `groupby` is the efficient and preferred approach – Eli Bendersky Mar 06 '10 at 15:40
You should make that clear... notice the constraint in the question says "I have a list which has repeating items"... the list the OP gave was just an example. I don't think this solution is general enough. If the OP specified that the input list always had the elements grouped, I would agree. – Tom Mar 06 '10 at 15:44
@Tom: you're right - I've updated the answer (BTW I assumed from his "repeating items" that they're grouped) – Eli Bendersky Mar 06 '10 at 15:48
Ok Eli... thanks for the update :-). I revoke my -1 because your answer is now more clear. – Tom Mar 06 '10 at 15:58
Is there a way to sort the resulting tuple list by count? – geotheory Aug 18 '15 at 19:54
Yes! If `g` is the resulting object then `sorted(g, key=lambda x: x[1])` – geotheory Aug 18 '15 at 20:10
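
To make the sorting comment concrete, here is a small sketch of ordering (item, count) pairs by count. The sample list pairs is invented for illustration, and operator.itemgetter(1) is simply an alternative to the lambda in the comment.

```python
from operator import itemgetter

# Hypothetical result list, e.g. as produced by one of the answers here.
pairs = [('a', 2), ('b', 3), ('c', 1)]

# Ascending by count (the second element of each tuple).
ascending = sorted(pairs, key=itemgetter(1))
print(ascending)   # [('c', 1), ('a', 2), ('b', 3)]

# Most frequent first.
descending = sorted(pairs, key=itemgetter(1), reverse=True)
print(descending)  # [('b', 3), ('a', 2), ('c', 1)]
```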

score 15 · Answer 3 · answered Mar 06 '10 at 16:50

>>> mylist=['a', 'a', 'b', 'b', 'b']
>>> sorted((i, mylist.count(i)) for i in set(mylist))  # set order is arbitrary, so sort for stable output
[('a', 2), ('b', 3)]

answered Mar 06 '10 at 16:50

ghostdog74

score 7 · Answer 4 · answered Sep 07 '18 at 15:31

If you are willing to use a 3rd party library, NumPy offers a convenient solution. This is particularly efficient if your list contains only numeric data.

import numpy as np

L = ['a', 'a', 'b', 'b', 'b']

res = list(zip(*np.unique(L, return_counts=True)))

# [('a', 2), ('b', 3)]  (the elements are NumPy scalars, not plain str/int)

To understand the syntax, note np.unique here returns a tuple of unique values and counts:

uniq, counts = np.unique(L, return_counts=True)

print(uniq)    # ['a' 'b']
print(counts)  # [2 3]

See also: What are the advantages of NumPy over regular Python lists?

score 3 · Answer 5 · answered Mar 06 '10 at 15:31

This isn't a one-liner, but I like it because it clearly passes over the initial list of values only once (instead of calling count on it for every unique item):

>>> from collections import defaultdict
>>> l = ['a', 'a', 'b', 'b', 'b']
>>> d = defaultdict(int)
>>> for i in l:
...     d[i] += 1
...
>>> d
defaultdict(<class 'int'>, {'a': 2, 'b': 3})
>>> list(d.items())  # iteritems() on Python 2
[('a', 2), ('b', 3)]

score 3 · Answer 6 · answered Mar 06 '10 at 16:34

The "old school" way:

>>> alist = ['a', 'a', 'b', 'b', 'b']
>>> d = {}
>>> for i in alist:
...     if i not in d: d[i] = 1  # d.has_key(i) was removed in Python 3
...     else: d[i] += 1
...
>>> d
{'a': 2, 'b': 3}
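
A slightly shorter variant of the same plain-dict idea, sketched here with dict.get so that no membership test is needed. The function name count_items is hypothetical and not part of the answer.

```python
def count_items(values):
    """Count occurrences with a plain dict in a single pass."""
    counts = {}
    for item in values:
        # get() returns 0 for items not seen yet, so no existence check is needed.
        counts[item] = counts.get(item, 0) + 1
    return counts

result = count_items(['a', 'a', 'b', 'b', 'b'])
print(result)                # {'a': 2, 'b': 3}
print(list(result.items()))  # [('a', 2), ('b', 3)]
```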

answered Mar 06 '10 at 16:34

ghostdog74

score 1 · Answer 7 · answered Mar 06 '10 at 15:48

Another way to do this would be

mylist = [1, 1, 2, 3, 3, 3, 4, 4, 4, 4]
mydict = {}
for i in mylist:
    if i in mydict: mydict[i] += 1
    else: mydict[i] = 1

then to get the list of tuples,

mytups = [(i, mydict[i]) for i in mydict]

This only goes over the list once, but it does have to traverse the dictionary once as well. However, if the list contains many duplicates, the dictionary will be much smaller than the list and therefore quick to traverse.

Nevertheless, not a very pretty or concise bit of code, I'll admit.
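
To put rough numbers on the two traversals, here is the same code run on the sample data, with the sizes printed. This is only a sketch of the reasoning, not a benchmark; the final ordering assumes Python 3.7+, where dicts keep insertion order.

```python
mylist = [1, 1, 2, 3, 3, 3, 4, 4, 4, 4]

mydict = {}
for i in mylist:  # first pass: one step per list element
    if i in mydict:
        mydict[i] += 1
    else:
        mydict[i] = 1

mytups = list(mydict.items())  # second pass: one step per unique key

print(len(mylist), len(mydict))  # 10 4
print(mytups)                    # [(1, 2), (2, 1), (3, 3), (4, 4)]
```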

answered Mar 06 '10 at 15:48

Aaron

This is identical in spirit to my solution... except defaultdict consolidates the first part (since you don't have to check for existence) and list(mydict.iteritems()) is shorter than the list comprehension. – Tom Mar 06 '10 at 15:55
`mytups = mydict.items()` is a simpler way to get the list of tuples. – PaulMcG Mar 06 '10 at 17:38
Thanks @Paul and @Tom. It seems like there is always a better way to do something in Python. :) – Aaron Mar 06 '10 at 18:07

score 1 · Answer 8 · answered Mar 06 '10 at 17:28

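A solution without hashing. It counts consecutive runs, so equal items must be adjacent:

from functools import reduce  # reduce is no longer a built-in on Python 3

def lcount(lst):
    return reduce(lambda a, b: a[0:-1] + [(a[-1][0], a[-1][1]+1)] if a and b == a[-1][0] else a + [(b, 1)], lst, [])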

>>> lcount([])
[]
>>> lcount(['a'])
[('a', 1)]
>>> lcount(['a', 'a', 'a', 'b', 'b'])
[('a', 3), ('b', 2)]
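
The reduce version builds a new list at every step. As a hedged alternative, the same run counting can be sketched as a plain loop that updates the last pair in place; the name lcount_loop is made up here, and like lcount it only merges adjacent equal items.

```python
def lcount_loop(lst):
    """Count consecutive runs without hashing, updating the result in place."""
    result = []
    for item in lst:
        if result and result[-1][0] == item:
            # Same item as the current run: bump its count.
            result[-1] = (item, result[-1][1] + 1)
        else:
            # A new run starts here.
            result.append((item, 1))
    return result

print(lcount_loop(['a', 'a', 'a', 'b', 'b']))  # [('a', 3), ('b', 2)]
print(lcount_loop(['a', 'b', 'a']))            # [('a', 1), ('b', 1), ('a', 1)]
```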

score 1 · Answer 9 · edited Apr 29 '15 at 21:33

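Convert the data into a pandas Series s; s.value_counts() then gives the count of each unique value. The loop below goes one step further and prints, for each distinct count, how many values occur that many times:

import pandas as pd
s = pd.Series(['a', 'a', 'b', 'b', 'b'])
for i in sorted(s.value_counts().unique()):
    print(i, (s.value_counts() == i).sum())  # a count, then how many values occur that often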

edited Apr 29 '15 at 21:33

Kevlar

answered Apr 29 '15 at 20:59

Ali Arar

score 0 · Answer 10 · answered May 15 '18 at 10:05

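With the help of pandas you can do this:

import pandas as pd
my_list = ['a', 'a', 'b', 'b', 'b']
dict(pd.Series(my_list).value_counts())  # top-level pd.value_counts is deprecated in recent pandas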

answered May 15 '18 at 10:05

zashishz
