6

I'm having a difficult time understanding what's happening in these two examples I found of the Kadane Algorithm. I'm new to Python and I'm hoping understanding this complex algo will help me see/read programs better.

Why would one example be better than the other, is it just List vs Range? Is there something else that makes one of the examples more efficient? Also, some questions about what's happening in the calculations. (questions inside the examples)

I've used PythonTutor to help me get a visual on what exactly is happening step by step.

Example 1:

In PythonTuter, when you select next step in the screen shot provided, The value of so_far turns to 1. How is this? Giving the sum, I've thought its adding -2 + 1 which is -1, so when so_far turns to 1, how is this?

        def max_sub(nums):
            max_sum = 0
            so_far = nums[0]
        
            for x in nums[1:]:
                so_far = max(x, x + so_far)
                max_sum = max(so_far, max_sum)
            return max_sum
        
nums = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
    max_sub(nums)
                6 

enter image description here

Example 2:

Similar question for this one, when I select NEXT step, the max_sum turns from -2 to 4... but how so if it's adding the element in the 2 (which is 4). To me, that would be -2 + 4 = 2 ?

def maxSubArraySum(a,size):
     
    max_so_far =a[0]
    curr_max = a[0]
     
    for i in range(1,size):
        curr_max = max(a[i], curr_max + a[i])
        max_so_far = max(max_so_far,curr_max)
         
    return max_so_far
 
a = [-2, -3, 4, -1, -2, 1, 5, -3]
print("Maximum contiguous sum is" , maxSubArraySum(a,len(a)))
     Maximum contiguous sum is 7

enter image description here

So, this would be a 2 part question than:

[1]Based on understandings, why would one be more pythonic and more efficient than the other? 
[2]How can I better understand the calculations happening in the examples?
  • 1
    I'm not really understanding your questions. Yes, you have the right results for the additions - but that result is only one parameter to a `max()`, and in both cases the other parameter is higher and takes precedence. (Example 1 is badly broken, you're using `nums` directly rather than the parameter passed to the function.) – jasonharper Aug 31 '21 at 18:39
  • It seems that there is a typo in Example 1- def `max_sub(nums)` - please double check and confirm. Otherwise, it won't work. – Daniel Hao Aug 31 '21 at 18:46
  • 2
    The `for` loop in the first example is probably considered more Pythonic, but the difference in efficiency is minimal. You would need a very large list before you'd notice any difference. The line `for x in nums[1:]:` does the same thing as `for i in range(1, len(nums)): x = nums[i]` The second version is useful if you want to change the values in the list. The first version doesn't give you an index into the list that you could use to modify the list elements. – user3386109 Aug 31 '21 at 18:59
  • Will argue if it can get the number directly, why it has to rely on `range` then index? Example 1 is more readable to me. And Example 2 is strange to specify `size` as 2nd parameters, it's not flexible? – Daniel Hao Aug 31 '21 at 19:27

2 Answers2

1

Simply watch each step and you could figure out this problem: [Notes] this program seems to work based on the assumption of mixed integer numbers? only positive and negatives.

# starting
so_far  = -2   # init. to nums[0]
max_sum = 0

# in the for-loop:
x =  1         # starting with nums[1:]
so_far = max(1, -1)   -> 1   (x is 1, -2 + 1)
max_sum = max(0, 1)   -> 1
.....   continue .... each step is to find the max accumulated numbers sum, as it's evident in the max( ) statement.  *There is no `sum` involved, except it tried to determine the current x is good (not negative) then so add it to the so_far.
Daniel Hao
  • 4,922
  • 3
  • 10
  • 23
1

More performance measurement data points to compare these two different approaches shown that first example is definitely faster ~22-24% faster than 2nd one with input size of 2k.

if __name__ == '__main__':
    L = list(range(-1_000, 1_000, 1))
    random.shuffle(L)

    baseTestCase = partial(max_sub, nums=L)
    print(timeit.timeit(baseTestCase, number=100_000))  # 86.0588067
    
    rangeTestCase = partial(max_SubArraySum, a=L, size=len(L))
    print(timeit.timeit(rangeTestCase, number=100_000)) # 105.415955
Daniel Hao
  • 4,922
  • 3
  • 10
  • 23