Counting the number of inversions in merge sort

Question

I have implemented a merge sort and, as well as sorting, I would like it to compute the number of inversions in the original array.

Below is my attempt at implementing it, which for some reason doesn't compute the number of inversions correctly.

For example, mergeSort([4, 3, 2, 1]) should return (6, [1, 2, 3, 4]).

def mergeSort(alist):
    count = 0  
    if len(alist)>1:
        mid = len(alist)//2
        lefthalf = alist[:mid]
        righthalf = alist[mid:]

        mergeSort(lefthalf)
        mergeSort(righthalf)

        i=0
        j=0
        k=0

        while i < len(lefthalf) and j < len(righthalf):
            if lefthalf[i] < righthalf[j]:
                alist[k]=lefthalf[i]
                i=i+1
            else:
                alist[k]=righthalf[j]
                count +=len(lefthalf[i:])
                j=j+1    
            k=k+1

        while i < len(lefthalf):
            alist[k]=lefthalf[i]
            i=i+1
            k=k+1

        while j < len(righthalf):
            alist[k]=righthalf[j]
            j=j+1
            k=k+1   

   return count, alist

The indentation is wrong. That makes it hard to run this. Looking at it. — Kenny Ostrom, Feb 19 '17 at 18:34
What do you mean by a change? Which operations specifically are you trying to count? Mergesort does not use a "swap" operation. It uses a "merge" operation. E.g. if you merge `[3]` and `[1 4]`, you get `[1, 3, 4]`. So do you count item comparisons (there's just one between 1 and 3), inserts (there are three), or merges (this would be a single merge)? — naktinis, Feb 19 '17 at 18:54
For instance, given the array [1,3,2], putting it in order requires swapping (3,2), so it has one inversion. — Bruno Santos, Feb 19 '17 at 19:18
Quit asking what an inversion is. Click on the tag for a definition. It's a well defined term from any intro algorithms text. — Kenny Ostrom, Feb 19 '17 at 19:23
Fair enough, but it's not exactly "the number of changes necessary to order an array" as OP said. `4, 3, 2, 1` has an inversion number of 6 `{(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)}`, but you only need 2 changes to sort the list: `(1, 4)` and `(2, 3)`. That's why I asked what OP meant by "change". — naktinis, Feb 19 '17 at 20:51

naktinis · Accepted Answer · score 2 · 2017-02-19T19:50:05.907

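The main problem is that the return values of the recursive calls were discarded, so the inversions counted while sorting the left and right halves were never added to the total.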

def mergeSort(alist):
    """Return (inversion count, sorted copy) of alist."""
    count = 0
    leftcount = 0
    rightcount = 0
    blist = []
    if len(alist) > 1:
        mid = len(alist) // 2
        # Keep the counts returned by the recursive calls.
        leftcount, lefthalf = mergeSort(alist[:mid])
        rightcount, righthalf = mergeSort(alist[mid:])

        i = 0
        j = 0

        while i < len(lefthalf) and j < len(righthalf):
            # <= so that equal elements are not counted as inversions.
            if lefthalf[i] <= righthalf[j]:
                blist.append(lefthalf[i])
                i += 1
            else:
                # righthalf[j] is smaller than every remaining left element.
                blist.append(righthalf[j])
                j += 1
                count += len(lefthalf) - i

        # Copy the leftovers; they add no inversions.
        while i < len(lefthalf):
            blist.append(lefthalf[i])
            i += 1

        while j < len(righthalf):
            blist.append(righthalf[j])
            j += 1
    else:
        blist = alist[:]

    return count + leftcount + rightcount, blist
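
One way to sanity-check a merge-based counter like this is to compare it with a direct O(n^2) count taken straight from the definition of an inversion (a pair i < j with data[i] > data[j]). The sketch below is only an illustration: the names `count_inversions_bruteforce`, `count_inversions_merge` and `cross_check` are made up for this example and are not part of the original answer, and the merge version is a compact restatement under the assumption that equal elements should not count as inversions.

```python
import random


def count_inversions_bruteforce(data):
    """Count pairs (i, j) with i < j and data[i] > data[j] directly, in O(n^2)."""
    n = len(data)
    return sum(1 for i in range(n) for j in range(i + 1, n) if data[i] > data[j])


def count_inversions_merge(data):
    """Return (inversion_count, sorted_copy) using merge sort, in O(n log n)."""
    if len(data) <= 1:
        return 0, list(data)
    mid = len(data) // 2
    left_count, left = count_inversions_merge(data[:mid])
    right_count, right = count_inversions_merge(data[mid:])
    merged = []
    count = i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] jumps ahead of every remaining left element:
            # each of those elements forms one inversion with it.
            merged.append(right[j])
            j += 1
            count += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return left_count + right_count + count, merged


def cross_check(trials=200, max_len=30, seed=0):
    """Compare both counters on random lists; return True if they always agree."""
    rng = random.Random(seed)
    for _ in range(trials):
        # A small value range (0..9) forces duplicates, which exercises the <= case.
        data = [rng.randint(0, 9) for _ in range(rng.randint(0, max_len))]
        count, result = count_inversions_merge(data)
        if count != count_inversions_bruteforce(data) or result != sorted(data):
            return False
    return True
```

Calling `cross_check()` returns `True` when the two counters agree on every random list. The brute-force version is too slow for large inputs, but it is short enough to trust as a reference.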

edited Feb 19 '17 at 19:50

answered Feb 19 '17 at 18:59


I am trying to count the number of changes necessary to order an array – Bruno Santos Feb 19 '17 at 19:13
Can you specify what counts as a change in merge sort (and give an example)? Some algorithms use a "swap" function, but merge sort doesn't. If you're trying to count the number of items that changed positions, maybe you could do that separately: first, sort the list and then compare each item of the new array with the original. – naktinis Feb 19 '17 at 19:17
For instance, given the array [1,3,2], putting it in order requires swapping (3,2), so it has one inversion. – Bruno Santos Feb 19 '17 at 19:21
You solved my problem. Thanks man! – Bruno Santos Feb 20 '17 at 19:33

score 0 · Answer 2 · answered Feb 19 '17 at 18:55

Your function returns a tuple of (inversion count, sorted list). However, your recursive calls ignore that return value, so any inversions counted below the top level are discarded. Capture both values and add the counts to the total:

  lc, lefthalf = mergeSort(alist[:mid])
  rc, righthalf = mergeSort(alist[mid:])
  count = count + lc + rc

If you share this with classmates, you can also use these test cases to validate it:

def count_inversions(data):
    count, result = mergeSort(data)
    return count

test_cases = [
    (3, [1,3,5,2,4,6]),
    (590, [37, 7, 2, 14, 35, 47, 10, 24, 44, 17, 34, 11, 16, 48, 1, 39, 6, 33, 43, 26, 40, 4, 28, 5, 38, 41, 42, 12, 13, 21, 29, 18, 3, 19, 0, 32, 46, 27, 31, 25, 15, 36, 20, 8, 9, 49, 22, 23, 30, 45]),
    (2372, [4, 80, 70, 23, 9, 60, 68, 27, 66, 78, 12, 40, 52, 53, 44, 8, 49, 28, 18, 46, 21, 39, 51, 7, 87, 99, 69, 62, 84, 6, 79, 67, 14, 98, 83, 0, 96, 5, 82, 10, 26, 48, 3, 2, 15, 92, 11, 55, 63, 97, 43, 45, 81, 42, 95, 20, 25, 74, 24, 72, 91, 35, 86, 19, 75, 58, 71, 47, 76, 59, 64, 93, 17, 50, 56, 94, 90, 89, 32, 37, 34, 65, 1, 73, 41, 36, 57, 77, 30, 22, 13, 29, 38, 16, 88, 61, 31, 85, 33, 54]),
]

def validate():
    for expected, data in test_cases:
        answer = count_inversions(data)
        if answer != expected:
            print("FAILED VALIDATION -- actual:", answer, "expected:", expected, "data:", data)

validate()

Hmm. That must be old. I usually raise ValueError on a failed test case. — Kenny Ostrom, Feb 19 '17 at 18:59
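
A variant of the check that raises ValueError on a failed case, as the comment suggests, might look like the sketch below. It is only an illustration: `validate` here takes the counting function as a parameter, and `count_inversions_bruteforce` and `SMALL_CASES` are hypothetical names introduced so the example runs on its own. They are not from the original answer.

```python
def count_inversions_bruteforce(data):
    """Reference counter: pairs (i, j) with i < j and data[i] > data[j]."""
    n = len(data)
    return sum(1 for i in range(n) for j in range(i + 1, n) if data[i] > data[j])


def validate(count_fn, test_cases):
    """Raise ValueError on the first case where count_fn disagrees with the expected count."""
    for expected, data in test_cases:
        # Pass a copy in case count_fn sorts its argument in place.
        answer = count_fn(list(data))
        if answer != expected:
            raise ValueError(
                "actual: {} expected: {} data: {}".format(answer, expected, data)
            )


# Small cases whose counts are easy to check by hand.
SMALL_CASES = [
    (0, []),
    (3, [1, 3, 5, 2, 4, 6]),
    (6, [4, 3, 2, 1]),
]

# Returns silently when every case passes.
validate(count_inversions_bruteforce, SMALL_CASES)
```

Raising an exception stops at the first failure and cannot be overlooked in the output, whereas the print-based version reports every failing case and keeps going.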
