I am trying to debug a merge method in a program for merge-sort. Here is the method:

    def merge(self, leftarray, rightarray):
        n = len(leftarray) + len(rightarray)
        print "len(leftarray) = "+str(len(leftarray))
        print "len(rightarray) = "+str(len(rightarray))
        i = 0
        j = 0
        merged = []

        for k in range(n):
            if i == len(leftarray):
                merged.extend(rightarray[j:])
                k += len(rightarray[j:])

            elif j == len(rightarray):
                merged.extend(leftarray[i:])
                k += len(leftarray[i:])

            elif leftarray[i] <= rightarray[j]:
                merged.append(leftarray[i])
                i += 1

            elif leftarray[i] > rightarray[j] :
                merged.append(rightarray[j])
                j += 1

        return merged

The for k in range(n) is the loop exhibiting the problem. Here is the debugger trace:

> /home/**/Documents/**/**/merge_sort.py(36)merge()
-> elif j == len(rightarray):
(Pdb) n
> /home/**/Documents/**/**/merge_sort.py(37)merge()
-> merged.extend(leftarray[i:])
(Pdb) n
> /home/**/Documents/**/**/merge_sort.py(39)merge()
-> k += len(leftarray[i:])
(Pdb) n
> /home/**/Documents/**/**/merge_sort.py(30)merge()
-> for k in range(n):
(Pdb) p k
3
(Pdb) n
> /home/**/Documents/**/**/merge_sort.py(31)merge()
-> if i == len(leftarray):
(Pdb) p n
3
(Pdb) 

As the trace shows, n is 3, so execution should not enter the loop body when k is 3. However, execution proceeds to the line if i == len(leftarray): instead of return merged.

Flame of udun

2 Answers

The debugger shows you the for k in range(n) line before that line executes. So k is still bound to the result of the last statement that touched it:

k += len(leftarray[i:])

You need to print k after the for loop has assigned the next value in the range(n) series to k.

To map this out in more detail, here are your debug steps with annotations:

  1.  

    > /home/**/Documents/**/**/merge_sort.py(36)merge()
    -> elif j == len(rightarray):
    

    k is untouched and still set to what for k in range(n) set it to.

    (Pdb) n
    

    You have now executed the j == len(rightarray) test; it was true, so you step to:

  2.  

    > /home/**/Documents/**/**/merge_sort.py(37)merge()
    -> merged.extend(leftarray[i:])
    (Pdb) n
    > /home/**/Documents/**/**/merge_sort.py(39)merge()
    -> k += len(leftarray[i:])
    

    merged has been extended and you have stepped to the k += len(leftarray[i:]) line. k is still bound to the value the loop assigned to it.

    (Pdb) n
    

    You have now executed k += len(leftarray[i:]), so k is set to 3, replacing whatever value it was bound to before.

  3.  

    > /home/**/Documents/**/**/merge_sort.py(30)merge()
    -> for k in range(n):
    

    This line hasn't executed yet, so k is still bound to 3:

    (Pdb) p k
    3
    (Pdb) n
    

    Now k is no longer 3; the for statement has rebound it to the next value in the range(n) sequence.

  4.  

    > /home/**/Documents/**/**/merge_sort.py(31)merge()
    -> if i == len(leftarray):
    (Pdb) p n
    3
    

    You need to print k again at this point; it'll be 1 or 2, depending on where you were in the iteration.

The for loop creates its iterator once, when the loop starts; you gave it a range() sequence, and it follows that sequence regardless of what the loop body does. You cannot alter k and expect it not to be replaced by the next value from the series.
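
To see this behaviour in isolation, here is a minimal sketch. The function name loop_values and the +100 offset are made up for illustration and are not part of the question's code. It records the value the for statement assigns to k on each pass, while the body rebinds k to something else:

```python
def loop_values(n):
    """Return the values the for statement assigns to k, one per iteration."""
    seen = []
    for k in range(n):
        seen.append(k)  # the value the for statement just bound to k
        k += 100        # rebinding k here has no effect on the next iteration
    return seen


print(loop_values(3))
```

The call prints [0, 1, 2]: the loop body runs exactly three times, and each pass starts with the next value from range(3) no matter what the previous pass did to k.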

You'd use a while loop to do that instead:

k = 0
while k < n:
    # do things, including altering k further
    k += 1
Martijn Pieters

This is an interesting gotcha with the for-range idiom: it is not a C for loop. The loop will run n times with the values k=0 to n-1, regardless of how k is modified in the loop body. If you want to simulate a C-style for loop (where you can terminate early by changing the loop variable), use a while loop instead.
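
As a rough sketch of that advice applied to the question's merge, the version below assumes a standalone function rather than a method (so there is no self) and drops the debugging prints. The counter k is removed entirely, and the while condition is driven by i and j instead. This is one possible rewrite, not necessarily how the asker fixed it:

```python
def merge(leftarray, rightarray):
    """Merge two already-sorted lists into one sorted list."""
    i = 0
    j = 0
    merged = []

    # Run only while both inputs still have unmerged elements.
    while i < len(leftarray) and j < len(rightarray):
        if leftarray[i] <= rightarray[j]:
            merged.append(leftarray[i])
            i += 1
        else:
            merged.append(rightarray[j])
            j += 1

    # At most one of these slices is non-empty; append the leftovers once.
    merged.extend(leftarray[i:])
    merged.extend(rightarray[j:])
    return merged


print(merge([1, 3, 5], [2, 4]))
```

Because the loop ends as soon as either input is exhausted, the leftover tail is appended exactly once, which avoids the repeated extend calls that the for k in range(n) version can make.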

nneonneo