-3

I have two lists, and for each one I want to get how many times each value occurs in it (the len() of each group of equal values).

A = [1,1,2,2]
B = [3,3,3,3,7,7,7]

In the first list, the numbers 1 and 2 each appear twice. I want to find out how many times each number repeats in a list: for the first list that is 2 for the number 1 and 2 for the number 2.

Pavel.D

3 Answers

3

This is a job for collections.Counter:

>>> from collections import Counter
>>> Counter([1,1,2,2])
Counter({1: 2, 2: 2})
>>> Counter([3,3,3,3,7,7,7])
Counter({3: 4, 7: 3})
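
If only the count for one particular value is needed, the Counter can be indexed like a dict. The following is a minimal sketch using the lists from the question; the variable names `counts_a` and `counts_b` are only illustrative.

```python
from collections import Counter

A = [1, 1, 2, 2]
B = [3, 3, 3, 3, 7, 7, 7]

counts_a = Counter(A)
counts_b = Counter(B)

# Look up a single value directly by key
print(counts_a[1])  # 2
print(counts_b[3])  # 4

# A value that never occurs gives 0 instead of raising KeyError
print(counts_b[5])  # 0

# most_common(n) returns the n most frequent (value, count) pairs
print(counts_b.most_common(1))  # [(3, 4)]
```

No extra loop over keys and values is needed unless every count has to be listed.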
nicholishen
  • only if you do not care about "stretches": [1,1,1,2,2,2,3,3,2,2,2,2,2,2] will give only 1 number for 2s – Patrick Artner Dec 14 '18 at 22:56
  • Thanks, quick reply, but it gives the result back as a dict. So I need one more for-loop over keys and values to get, for example, the number 4 for 3 or 3 for 7. – Pavel.D Dec 14 '18 at 22:59
  • No just access the dict using the key directly. `>>> number_of_threes = Counter([3,3,3,3,7,7,7])[3]` ... `4` – nicholishen Dec 14 '18 at 23:01
2

A quick one-line solution that doesn't use collections.Counter:

A=[3,4,4,4,3,5,6,8,4,3]
duplicates=dict(set((x,A.count(x)) for x in filter(lambda rec : A.count(rec)>1,A)))
output:
{3: 3, 4: 4} 

This solution doesn't account for "stretches", however.
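
For longer lists, calling `A.count(x)` once per element gets expensive. As a rough sketch (not part of the original one-liner), the same dictionary of repeated values can be built in a single pass with a plain dict, still without collections.Counter. The function name `duplicate_counts` is made up for this example.

```python
def duplicate_counts(values):
    """Return {value: count} for every value that occurs more than once."""
    counts = {}
    # Single pass: tally each value as it is seen
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    # Keep only the values that are actually repeated
    return {v: n for v, n in counts.items() if n > 1}


A = [3, 4, 4, 4, 3, 5, 6, 8, 4, 3]
print(duplicate_counts(A))  # {3: 3, 4: 4}
```

Like the one-liner, this counts occurrences over the whole list and ignores "stretches".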

Larr Bear
  • Your answer is short and simple :) Thanks – Pavel.D Dec 14 '18 at 23:17
  • I just timed it and `collections.Counter` is over 100% faster @100,000 iterations each. – nicholishen Dec 14 '18 at 23:44
  • @nicholishen `collections.Counter` has its time and place, and is clearly very effective for this situation. I edited my post, as it might have been misleading for this particular question. There are times when `collections.Counter` is very slow and a creative solution could lead to better efficiency, [as seen in the last answer here](https://stackoverflow.com/questions/43485195/python-collections-counter-efficiency). I wanted to allude to the fact that it is not always the most effective solution, but I might have overstepped the scope of this post in doing so. – Larr Bear Dec 15 '18 at 03:12
  • @LarrBear Ironically, the reason your solution is slower than the `Counter` solution is the same reason that the `Counter` solution you linked to is slow. Your solution is roughly quadratic, iterating over `A` multiple times for each element in `A`. By contrast, building a `Counter` only iterates over `A` once. That single iteration is more expensive than any one of the other iterations, but it's markedly more efficient than all of them taken together. – Patrick Haugh Dec 15 '18 at 13:56
1

You can simply iterate over your numbers and count identical ones - or use itertools.groupby:

def count_em(l):
    """Returns the lengths of runs of consecutive equal numbers as a list.
    Example: [1,2,3,4,4,4,3,3] ==> [1,1,1,3,2]"""
    if not isinstance(l,list):
        return None
    if not l:
        # an empty list has no runs; this avoids an IndexError on l[0] below
        return []

    def count():
        """Counts equal elements, yields each count"""
        # set the first elem as current
        curr = [l[0]]

        # for the rest of elements
        for elem in l[1:]:
            if elem == curr[-1]:
                # append as long as the element is same as last one in curr 
                curr.append(elem)
            else:
                # yield the number
                yield len(curr)
                # reset curr to count the new ones
                curr = [elem]
        # yield last group
        yield len(curr)

    # get all yields and return them as list
    return list(count())


def using_groupby(l):
    """Uses itertools.groupby and a list comp to get the lengths."""
    from itertools import groupby
    grp = groupby(l) # this groups by the elements themselves
    # count the grouped items and return as list
    return [sum(1 for _ in items) for _, items in grp]

Test:

A = [1,1,2,2]
B = [3,3,3,3,7,7,7]
C = [1,1,2,2,2,1,1,1,1,1,6,6]

for e in [A,B,C]:
    print(count_em(e),  using_groupby(e))

Output:

# count_em     using_groupby    Input
[2, 2]         [2, 2]         # [1,1,2,2]
[4, 3]         [4, 3]         # [3,3,3,3,7,7,7]
[2, 3, 5, 2]   [2, 3, 5, 2]   # [1,1,2,2,2,1,1,1,1,1,6,6]
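
The lists of lengths above do not say which number each run belongs to. If that is needed, a small variation on the groupby approach can keep the key next to the length. This is only a sketch, and the name `runs` is made up for the example.

```python
from itertools import groupby


def runs(values):
    """Return (value, run_length) pairs for consecutive equal elements."""
    # groupby yields a key and an iterator over each consecutive group
    return [(key, sum(1 for _ in group)) for key, group in groupby(values)]


C = [1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 6, 6]
print(runs(C))  # [(1, 2), (2, 3), (1, 5), (6, 2)]
```

Here the number 1 appears twice in the result, once for each separate stretch.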
Patrick Artner