Given k sorted arrays of size n each, merge them and print the sorted output.

The algorithm I followed is

  • iterate over each array
    • pick the element at index i in each of the k arrays
    • insert() it into the minheap
    • delMin() and append the result to the result array.

from heapq import heappop, heappush

def merge_k_arrays(list_of_lists):
    result = [] #len(list_of_lists[0])*len(list_of_lists)
    minHeap= []
    n, k=0,0

    print(list_of_lists)
    while n < len(list_of_lists[0]):
        if n ==0:# initial k size heap ready
            while k < len(list_of_lists):
                element= list_of_lists[k][n]
                heappush(minHeap ,element )
                k+=1
            result.append(heappop(minHeap))
        else: # one at a time.
            k =0
            while k < len(list_of_lists):
                element = list_of_lists[k][n]
                heappush(minHeap , element)
                result.append(heappop(minHeap))
                k+=1
        n += 1

    # add the left overs in the heap
    while minHeap:
        result.append(heappop(minHeap))

    return result

Input:

input = [   [1, 3, 5, 7],
            [2, 4, 6, 8],
            [0, 9, 10, 11],

        ] 

Output:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

Input:

input = [   [1, 3, 5, 7],
            [2, 4, 6, 8],
            [3, 3, 3, 3],
            [7, 7, 7,7]
        ]

Output:

[0, 1, 2, 3, 3, 3, 4, 5, 6, 3, 7, 7, 7, 7, 3, 7, 8, 9, 10, 11]

Could anyone help me figure out what is missing from my algorithm so that it also merges arrays containing duplicates, as in the second input?

georgexsh
Anu
  • Is there a reason you're doing this instead of just using `heapq.merge`, which already exists and performs this exact functionality? (Technically, it's a generator function, where your function returns a `list`, but `list(heapq.merge(*list_of_lists))` would do your job for you) – ShadowRanger Nov 23 '18 at 02:45
  • @ShadowRanger, yes, I am curious to see how this common algorithm works without relying on the libraries. – Anu Nov 23 '18 at 18:06
  • Gotcha. Just a heads up, `heapq.merge` is actually implemented in Python (no C accelerators), so if you want a reference implementation, it's available. If you use `ipython` for interactive work (everyone should), simply importing `heapq`, then typing `heapq.merge??` will display the source code. – ShadowRanger Nov 23 '18 at 19:16
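
For reference, the `heapq.merge` call mentioned in the comments above can be tried directly on the second input from the question. This is only a usage sketch of the standard-library function; the variable names are placeholders.

```python
from heapq import merge

# Second input from the question: sorted rows, some with repeated values.
list_of_lists = [
    [1, 3, 5, 7],
    [2, 4, 6, 8],
    [3, 3, 3, 3],
    [7, 7, 7, 7],
]

# heapq.merge lazily yields items from already-sorted inputs in sorted order;
# list() collects them so the result matches what the question's function returns.
merged = list(merge(*list_of_lists))
print(merged)  # [1, 2, 3, 3, 3, 3, 3, 4, 5, 6, 7, 7, 7, 7, 7, 8]
```

Duplicates need no special handling here: equal values are simply emitted next to each other.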

2 Answers

0
from heapq import *

def mergeSort (arr):
    n = len(arr)
    minHeap = []
    
    for i in range(n):
        heappush(minHeap, (arr[i][0], i, arr[i]))
    
    print(minHeap)
    
    result = []
    while len(minHeap) > 0:
        num, ind, arr = heappop(minHeap)
        result.append(num)
        
        if len(arr) > ind + 1:
            heappush(minHeap, (arr[ind+1], ind+1, arr))
        
    
    return result
    
    

input = [   [1, 3, 5, 7],
            [2, 4, 6, 8],
            [0, 9, 10, 11],
            [-100]

        ] 
        
print(mergeSort(input))
  • Above code is giving wrong output. [1, 2, 3, 3, 3, 5, 6, 7, 7, 8] – Sxc-Dev May 29 '22 at 17:29
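
One likely cause of the wrong output reported above: the initial push stores the list index `i` in the tuple slot that the loop later reads as the position inside the list, so elements get skipped. Below is a minimal corrected sketch, assuming that was the only unintended behavior. The name `merge_sorted_lists`, the `(value, list index, element index)` tuple layout, and the skipping of empty lists are additions for illustration, not part of the answer.

```python
from heapq import heappush, heappop

def merge_sorted_lists(lists):
    """Merge already-sorted lists using a heap of at most len(lists) entries."""
    heap = []
    for list_idx, lst in enumerate(lists):
        if lst:  # skip empty lists (an addition; the answer assumes non-empty input)
            # (value, list index, element index): the list index breaks ties
            # between equal values, so tuple comparison never goes further.
            heappush(heap, (lst[0], list_idx, 0))

    result = []
    while heap:
        value, list_idx, elem_idx = heappop(heap)
        result.append(value)
        nxt = elem_idx + 1
        # Refill the heap from the same list the popped value came from.
        if nxt < len(lists[list_idx]):
            heappush(heap, (lists[list_idx][nxt], list_idx, nxt))
    return result

print(merge_sorted_lists([[1, 3, 5, 7], [2, 4, 6, 8], [0, 9, 10, 11], [-100]]))
# [-100, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```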
-1

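To fix your code, move the `result.append(heappop(minHeap))` call in your second nested while loop outside that loop, as in your first nested while loop. Each pass then pushes one element from every array and pops only once, so the popped value is always the smallest one not yet emitted. This will make your code work.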

        else: # one at a time.
            k =0
            while k < len(list_of_lists):
                element = list_of_lists[k][n]
                heappush(minHeap , element)
                k+=1
            result.append(heappop(minHeap))
        n += 1

If you have any space constraints, this is still problematic since you are adding nearly your entire input into the heap. If space is not an issue, there is a much cleaner way to write your solution:

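from heapq import heapify, heappop

def merge(A):
    result = []
    # Flatten every row into one list, then turn it into a heap in place.
    heap = [e for row in A for e in row]
    heapify(heap)
    # Pop until the heap is empty; each pop returns the current minimum.
    while heap:
        result.append(heappop(heap))
    return result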

Otherwise, you will need to use a smarter solution that only allows the heap to have k elements in total, with one element from each list, and the new element you add each step should come from the origin list of the element that was just popped.
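
The k-element approach described above can be sketched as follows. This is one possible formulation, not the only one: it walks each list through an iterator and uses `heapreplace` so the heap never grows past its initial size. The names `merge_k_sorted`, `_DONE`, and `origin` are hypothetical and chosen for illustration.

```python
from heapq import heapify, heappop, heapreplace

_DONE = object()  # sentinel marking an exhausted list (hypothetical helper)

def merge_k_sorted(list_of_lists):
    """Merge sorted lists while keeping at most one heap entry per list."""
    iterators = [iter(lst) for lst in list_of_lists]

    # Seed the heap with the first element of every non-empty list.
    heap = []
    for origin, it in enumerate(iterators):
        head = next(it, _DONE)
        if head is not _DONE:
            # origin is unique per entry, so ties on value are broken by it.
            heap.append((head, origin))
    heapify(heap)

    result = []
    while heap:
        value, origin = heap[0]  # smallest value and the list it came from
        result.append(value)
        following = next(iterators[origin], _DONE)
        if following is _DONE:
            heappop(heap)  # origin list is exhausted; the heap shrinks
        else:
            # Swap the popped entry for the next element of the same list.
            heapreplace(heap, (following, origin))
    return result

data = [[1, 3, 5, 7], [2, 4, 6, 8], [3, 3, 3, 3], [7, 7, 7, 7]]
print(merge_k_sorted(data))  # [1, 2, 3, 3, 3, 3, 3, 4, 5, 6, 7, 7, 7, 7, 7, 8]
```

Each of the n*k elements enters and leaves a heap of at most k entries exactly once, so the running time is O(n*k log k) with O(k) extra space beyond the output list.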

samuelli97
  • Thanks for the suggestion, but your algorithm is both time & space inefficient. The `asymptotic analysis of your algorithm`: `3rd line will take O(n*k)`, `4th line will run in O(n*k)`, `6th line will run in O(n*k log n*k)`, so total runtime of your algorithm is `n*k log n*k` which is not efficient. On the other hand, in my algo, I am trying to keep the `heapsize fixed to k(number of given arrays to merge)`. in first loop, I `build this k size heap in O(k log k)`, then I `pop the element which takes O(k log k)` and then adds text from input to heap that takes `O(n log n)` & repeat. – Anu Nov 23 '18 at 20:10
  • so, my algo. worst case runtime would be `O(n log n)` with space only `O(k)`, but what I am stuck at is my method is failing for duplicate input at the back because restoring heap doesn't sort the already present items which is barrier to allow duplicates. could you suggest a solution to this problem.? – Anu Nov 23 '18 at 20:14