Optimizing Kadane's Algorithm for numpy

Question

The standard way to find the maximum subarray is Kadene's algorithm. If the input is a large numpy array, is there anything faster than the native python implementation?

import timeit

setup = '''
import random
import numpy as np

def max_subarray(A):
    max_so_far = max_ending_here = 0
    for x in A:
        max_ending_here = max(0, max_ending_here + x)
        max_so_far      = max(max_so_far, max_ending_here)
    return max_so_far

B = np.random.randint(-100,100,size=100000)
'''

print min(timeit.Timer('max_subarray(B)',setup=setup).repeat(5, 100))

What do you mean by better? Faster? You can definitively get faster than a pure python implementation. You can optimize with Cython, write it in pure C and include it via `ctypes` or `Cython` e.g. — cel, Nov 13 '14 at 21:55
@cel yes, sorry if I wasn't clear. better => faster. Since the numpy array is already an optimized data type (e.g. fixed data type, contiguous array, etc.), I was wondering if there were (numpy) built-in operations that I could leverage. I hadn't thought the Cython route as I'm unfamiliar with it. — Hooked, Nov 13 '14 at 21:58
Do you simply want it to execute fast or to execute with a good time complexity? Simply doing a `cumsum` and a `sort` in numpy will be pretty fast regardless :) — Wolph, Nov 13 '14 at 22:02
@Wolph I'm looking for real-world execution time since Kadene is O(N) already. How would the cumsum/sort combo work? Wouldn't that require the array's initial index to be at the start? If it does work, compare the timings to the posted answer and I can accept it! — Hooked, Nov 13 '14 at 22:15
@Hooked: you're completely right :) I'm currently experimenting to solve this problem in a fast way but not too succesful so far. Guess the Cython path might be the easiest way to get it fast. I'll play a little bit more with numpy, usually you can get it fast with a little magic. — Wolph, Nov 13 '14 at 22:23
This kind of sequential array update is typically not possible to vectorize, unless it happens to conform to some built-in function. I don't think that is the case here, though... — Jaime, Nov 14 '14 at 00:14

Wolph · Accepted Answer · 2014-11-17T16:54:10.967

Little test with Cython in an iPython notebook (because of that no timeit, doesn't appear to work with the %%cython environment :)

Original version:

import numpy as np

B = np.random.randint(-100,100,size=100000)

def max_subarray(A):
    max_so_far = max_ending_here = 0
    for x in A:
        max_ending_here = max(0, max_ending_here + x)
        max_so_far      = max(max_so_far, max_ending_here)
    return max_so_far

import time

measurements = np.zeros(100, dtype='float')
for i in range(measurements.size):
    a = time.time()
    max_subarray(B)
    measurements[i] = time.time() - a

print 'non-c:', measurements.min(), measurements.max(), measurements.mean()

Cython version:

%%cython

import numpy as np
cimport numpy as np

B = np.random.randint(-100,100,size=100000)

DTYPE = np.int
ctypedef np.int_t DTYPE_t

cdef DTYPE_t c_max_subarray(np.ndarray A):
    # Type checking for safety
    assert A.dtype == DTYPE

    cdef DTYPE_t max_so_far = 0, max_ending_here = 0, x = 0
    for x in A:
        max_ending_here = max(0, max_ending_here + x)
        max_so_far      = max(max_so_far, max_ending_here)
    return max_so_far

import time

measurements = np.zeros(100, dtype='float')
for i in range(measurements.size):
    a = time.time()
    c_max_subarray(B)
    measurements[i] = time.time() - a

print 'Cython:', measurements.min(), measurements.max(), measurements.mean()

Results:

Cython: 0.00420188903809 0.00658392906189 0.00474049091339
non-c: 0.0485298633575 0.0644249916077 0.0522959709167

Definitely a notable increase without too much effort :)

You should really statically type `x` here. It's the loop variable and occurs in the arithmetic. I'm pretty sure you will gain MUCH speed. — cel, Nov 14 '14 at 11:24
I get: `Cython: 0.0273659229279 0.0550830364227 0.0315420007706` and `FastCython: 0.00628685951233 0.0138258934021 0.00742509126663`, when `x` is guaranteed to be an `int` — cel, Nov 14 '14 at 11:27
@cel For those not-versed in Cython, what specifically do you say to statically type x as an int? — Hooked, Nov 14 '14 at 15:44
in Cython you can use `cdef` to tell the compiler which type a variable should have. If you write `cdef int x` then you tell the compiler that it can expect that `x` always will be an `int`. This will give `x` a fixed (static) type for your whole program and allows the compiler to speed up your code. — cel, Nov 14 '14 at 16:14
@cel: completely forgot about that, I'll change it. Just become about 5 times faster :) — Wolph, Nov 17 '14 at 16:44

Optimizing Kadane's Algorithm for numpy

1 Answers1

Linked