Python algorithm to compare two sorted lists and count how many elements are the same

Question

I have to design an algorithm that compares two sorted lists of the same length and return the number of common values between them.

So if I have two lists a = [2, 9, 15, 27, 36, 40] and b = [9, 11, 15, 23, 36, 44], the algorithm should return the value of 3 as 9, 15 and 36 are present in both lists.

I know that there might be an easier to do with using sets, but since I'm trying to learn data structures and algorithms I'd prefer to do it the longer(harder way).

My current code uses any array merge algorithm which is non-working at the moment as I'm still confused as to r1 and r2, though i think they would be the right most element in the array, but I don't know how to get that. eg. r1 = 40 (from list a), and r2 = 44 (from list b)?

global a
a = [2, 9, 15, 27, 36, 40]

global b
b = [9, 11, 15, 23, 36, 44]

global c
c = []

def merge (a1, a, r1, a2, b, r2, c, list3):
    i = a
    j = b
    k = c
    r1 = 
    r2 = 
    while i <= r1 and j <= r2:
        if a1[i]<=a2[j]:
            a3[k] = a1[i]
            i += 1
        elif a3[k] >= a2[j]:
            j += 1
            k += 1
    while i <= r1:  
        a3[k] = a1[i]
        i += 1
        k += 1 
    while j <= r2:  
        a3[k] = a2[j]
        j += 1
        k += 1

Thank you for the help and feedback.

A naive solution should be doable in O(NlogN) time. What kind of solution are you looking for? — cs95, May 08 '18 at 02:14
@cᴏʟᴅsᴘᴇᴇᴅ as I am using a merge sort algorithm wouldn't O(n log(n)) be the best in regards to time complexity? If not what would you recommend? Please excuse me if I'm wrong I'm still learning these things. — MacGenius, May 08 '18 at 02:19
Since the lists are already sorted, why do you want to run mergesort? What I was thinking of: If your lists don't contain duplicates, you can iterate over list 1 and do a binary search on list 2. — cs95, May 08 '18 at 02:21
As this is part of some homework the lecturer stated to make an algorithm that does what the tile suggests. He wants us to use an array merge algorithm to compare the two lists and and modify it to increment a count if there are common values. — MacGenius, May 08 '18 at 02:29
`merge()` is just `O(n)` - you have no need to `merge_sort()`. Counting duplicates is also just `O(n)` — AChampion, May 08 '18 at 02:52

score 2 · Accepted Answer · edited May 08 '18 at 07:12

Okay if I read your question correctly you want to find common elements in two sorted list of equals lengths and return the number of common elements. I am a little confused by the use of merge here.

Anyways if that is what you want your algorithm to do. Since it is already sorted we can simply iterate over both the lists and find the common elements in linear time.

Algorithm:

Let i and j be the indices of a1 and a2 respectively initialized to 0
If a1[i] < a2[j] we know that the a1[i] does not exist in a2 as i and j point to the smallest element in the respective arrays. So we move i forward.
Same with a2[j] < a1[i].
If a1[i] == a2[j] then we have found a common element and we advance both i and j by 1 and continue till the end of either of the array.

the code

def find_common(a1, a2):
    list_len = len(a1)
    a3 = []
    i = j = 0
    while i < list_len and j < list_len:
        if a1[i] < a2[j]:
            i += 1
        elif a2[j] < a1[i]:
            j += 1
        else:
            a3.append(a1[i])    
            i +=1
            j +=1
    return a3

a = [2, 9, 15, 27, 36, 40]
b = [9, 11, 15, 23, 36, 44]
print(find_common(a, b))

Thank you, this works great, just had to add `print(len(find_common(a, b)))` to get the number of common values between them. — MacGenius, May 08 '18 at 11:37
If you just need the number that just have a counter instead of the list `a3`. e.g. `c = 0` and replace `a3.append(a1[i])` with `c += 1` then `find_common()` would just return `3` — AChampion, May 08 '18 at 14:24
@AChampion Yes, you're right. I added it so that it would be a little clearer. — thebenman, May 08 '18 at 14:37

AChampion · Answer 2 · 2018-05-08T13:39:53.400

r1 and r2 are just the lengths of the 2 lists.
Doing a simple merge on 2 lists is not as complicated as your example, here is a simplified merge:

def merge(a1, a2):
    r1, r2 = len(a1), len(a2)
    a3 = []
    i = j = 0

    while i < r1 and j < r2:
        if a1[i] < a2[j]:
            a3.append(a1[i])
            i += 1
        else:
            a3.append(a2[j])
            j += 1

    while i < r1:
        a3.append(a1[i])
        i += 1
    while j < r2:
        a3.append(a2[j])
        j += 1

    return a3

In []:
a = [2, 9, 15, 27, 36, 40]
b = [9, 11, 15, 23, 36, 44]
merge(a, b)

Out[]:
[2, 9, 9, 11, 15, 15, 23, 27, 36, 36, 40, 44]

Doing a duplicate count is simpler than this as you don't need to construct a new list but this should give you the basis for doing that count, and is only O(n).

shouldn't `a3.append(a2[i])` be `a3.append(a2[j])` – Ishpreet May 08 '18 at 05:54 — Ishpreet, May 08 '18 at 05:54

Vajjhala · Answer 3 · 2018-05-08T07:01:20.183

Using hash table you can solve this in linear time.

You could store one list in a hash table using python dictionary where the key would be the element (in this case an integer) and the value would be the number of occurrences of the element. Running time: O(n)

Then iterate through the other list and do a hash table lookup for each element. Keep a variable for counting the common values. Running time: O(n).

To avoid counting duplicates, as you iterate check if the previous element is the same, in which case move to the next element. You will need an extra variable to keep track of the previous element.

Python algorithm to compare two sorted lists and count how many elements are the same

3 Answers3