3

I am intersecting two lists with the following code:

def interlist(lst1, lst2):
    # Keep each element of lst2 that also appears in lst1 (duplicates in lst2 are kept)
    lst3 = list(filter(lambda x: x in lst1, lst2))
    return lst3

The thing is that I want to count every intersection between lst1 and lst2. The result should be a dictionary mapping each element to the number of times it overlaps in both lists.

kaya3
Cipher
  • What do you mean by count? If you just want the length of the result you can just do `return len(lst3)`? – CozyAzure Nov 28 '19 at 02:04
  • By *"count every intersection"* do you mean a dictionary showing how many times each element overlaps, or just the count of all of them, or just the number of distinct ones? – kaya3 Nov 28 '19 at 02:05
  • @kaya3 I mean a dictionary showing how many times each element overlaps – Cipher Nov 28 '19 at 02:07
  • I think you need to clarify something: Say an element x exists twice in list1 and 3 times in list2. x is clearly in the intersection. How many x's are supposed to exist in your desired outcome? – FatihAkici Nov 28 '19 at 02:18
  • @FatihAkici Something like this: `lst1 = ["a", "b", "c"]`, `lst2 = ["a", "a", "a", "b", "b"]`, output: `{"a": 3, "b": 2}`. That is, how many times each element of lst1 occurs in lst2 – Cipher Nov 28 '19 at 02:26
  • Perfect! How about `lst2=["a" , "b" , "c"]`, `lst1=["a" , "a" , "a" , "b", "b"]`? – FatihAkici Nov 28 '19 at 02:29

2 Answers

4

Here's a simple solution using collections.Counter and set intersection. The idea is to first count occurrences of each element in each list separately; then, the number of overlaps is the min of the two counts, for each element. This matches each occurrence in one list with a single occurrence in the other, so the min gives the number of matches that can be made. We only need to count elements which occur in both lists, so we take the intersection of the two key-sets.

If you want to count all matching pairs instead (i.e. each occurrence in lst1 gets matched with every occurrence in lst2), replace min(c1[k], c2[k]) with c1[k] * c2[k]. This counts the number of ways of choosing a pair with one occurrence from lst1 and one from lst2.

from collections import Counter

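def count_intersections(lst1, lst2):
    # Count how often each element occurs in each list
    c1 = Counter(lst1)
    c2 = Counter(lst2)
    # For elements present in both lists, the overlap is the smaller of the two counts
    return {k: min(c1[k], c2[k]) for k in c1.keys() & c2.keys()}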

Example:

>>> lst1 = ['a', 'a', 'a', 'b', 'b', 'c', 'e']
>>> lst2 = ['a', 'b', 'b', 'b', 'd', 'e']
>>> count_intersections(lst1, lst2)
{'b': 2, 'a': 1, 'e': 1}
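
As a side note, `Counter` objects support the `&` operator, which keeps each common key with the minimum of its two counts, so the min-based version can probably be written more compactly. The function name `count_intersections_multiset` is made up for this sketch; the result is converted to a plain `dict` to match the output above.

```python
from collections import Counter

def count_intersections_multiset(lst1, lst2):
    # Counter & Counter is multiset intersection: each key that occurs in
    # both counters is kept with the minimum of its two counts.
    return dict(Counter(lst1) & Counter(lst2))

lst1 = ['a', 'a', 'a', 'b', 'b', 'c', 'e']
lst2 = ['a', 'b', 'b', 'b', 'd', 'e']
result = count_intersections_multiset(lst1, lst2)
print(result)
```

The key order of the printed dictionary may differ from the example above, but the counts should be the same.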

This solution runs in O(m + n) time and uses at most O(m + n) auxiliary space, where m and n are the sizes of the two lists.
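
For the all-pairs variant mentioned above (the product instead of the min), a minimal sketch could look like this. The name `count_pairs` is hypothetical; everything else follows the function in this answer.

```python
from collections import Counter

def count_pairs(lst1, lst2):
    # Count how often each element occurs in each list
    c1 = Counter(lst1)
    c2 = Counter(lst2)
    # Each occurrence in lst1 pairs with every occurrence in lst2,
    # so the number of pairs is the product of the two counts.
    return {k: c1[k] * c2[k] for k in c1.keys() & c2.keys()}

pairs = count_pairs(["a", "b", "c"], ["a", "a", "a", "b", "b"])
print(pairs)
```

For these two lists this gives 3 for "a" and 2 for "b". Because the product is symmetric, swapping the two arguments gives the same counts.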

kaya3
  • It works! Is it possible to also return something like this: `lst1 = ["a", "b", "c"]`, `lst2 = ["a", "a", "a", "b", "b"]`, `{"a": 3, "b": 2}`? That is, how many times each element of lst1 occurs in lst2 – Cipher Nov 28 '19 at 02:24
  • If you want to count all pairs of intersections, replace `min(c1[k], c2[k])` with `c1[k] * c2[k]`. – kaya3 Nov 28 '19 at 02:26
  • Thank you, it works. Could you kindly explain what your code does? I kind of understand it, but I am new to the programming world – Cipher Nov 28 '19 at 02:38
0

Per your clarification of:

If lst1 = ["a", "b", "c"], lst2 = ["a", "a", "a", "b", "b"] then output = {"a": 3, "b": 2}, you can simply do:

output = {}
for x in set(lst1):
    cnt = lst2.count(x)
    if cnt > 0:
        output[x] = cnt
FatihAkici
  • This is a nice simple solution, but it takes O(mn) time. It's best to avoid using `.count` inside a loop, since that scans the whole list on every iteration; it's more efficient to do all the counts in one scan using a Counter. – kaya3 Nov 28 '19 at 02:46
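
Following the suggestion in the comment above, one way to keep this answer's semantics (how often each distinct element of lst1 occurs in lst2) while scanning lst2 only once might be the sketch below. The function name `count_in_second` is made up for illustration.

```python
from collections import Counter

def count_in_second(lst1, lst2):
    # One pass over lst2 builds all the counts up front
    c2 = Counter(lst2)
    # Look up each distinct element of lst1; skip those absent from lst2
    return {x: c2[x] for x in set(lst1) if x in c2}

output = count_in_second(["a", "b", "c"], ["a", "a", "a", "b", "b"])
print(output)
```

Unlike the min or product versions, this ignores how many times an element repeats in lst1, so the result is not symmetric: with the two lists swapped, "a" and "b" would each map to 1.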