Ordered list reduction
I need to reduce some lists where, depending on the element types, the speed and implementation of the binary operation vary, i.e. large speed-ups can be gained by reducing certain pairs with specific functions first.
For example foo(a[0], bar(a[1], a[2]))
might be a lot slower than bar(foo(a[0], a[1]), a[2])
but in this case give the same result.
I have the code that produces an optimal ordering in the form of a list of tuples (pair_index, binary_function)
already. I am struggling to implement an efficient function to perform the reduction, ideally one that returns a new partial function which can then be used repeatedly on lists of the same type-ordering but varying values.
Simple and slow(?) solution
Here is my naive solution involving a for loop, deletion of elements and closure over the (pair_index, binary_function)
list to return a 'precomputed' function.
def ordered_reduce(a, pair_indexes, binary_functions, precompute=False):
    """
    a: list to reduce, length n
    pair_indexes: order of pairs to reduce, length (n-1)
    binary_functions: functions to use for each reduction, length (n-1)
    """
    def ord_red_func(x):
        y = list(x)  # copy so the caller's list is not modified
        for p, f in zip(pair_indexes, binary_functions):
            b = f(y[p], y[p+1])
            # Replace the pair with its result
            del y[p]
            y[p] = b
        return y[0]
    return ord_red_func if precompute else ord_red_func(a)
>>> foos = (lambda a, b: a - b, lambda a, b: a + b, lambda a, b: a * b)
>>> ordered_reduce([1, 2, 3, 4], (2, 1, 0), foos)
1
>>> 1 * (2 + (3-4))
1
And how precomputation works:
>>> foo = ordered_reduce(None, (0, 1, 0), foos, precompute=True)
>>> foo([1, 2, 3, 4])
-7
>>> (1 - 2) * (3 + 4)
-7
However, it involves copying the whole list and is also (therefore?) slow. Is there a better or more standard way to do this?
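One direction that might avoid the repeated deletions, sketched here under assumptions: translate the shrinking-list pair indexes into fixed slot positions once, so that each call only overwrites slots in place. The name `make_ordered_reduce` and its explicit length argument `n` are hypothetical, and the sketch still makes one copy of the input per call.

```python
def make_ordered_reduce(n, pair_indexes, binary_functions):
    """Precompute a reduction plan for lists of length n (hypothetical helper)."""
    live = list(range(n))  # original slots that still hold a value
    plan = []
    for p, f in zip(pair_indexes, binary_functions):
        left, right = live[p], live[p + 1]
        del live[p + 1]  # one-off cost, paid only while building the plan
        plan.append((left, right, f))

    def reduce_with_plan(x):
        y = list(x)  # still one copy, so the caller's list is untouched
        for left, right, f in plan:
            y[left] = f(y[left], y[right])  # overwrite in place, no deletion
        return y[0]  # slot 0 is never a right operand, so it holds the result

    return reduce_with_plan


foos = (lambda a, b: a - b, lambda a, b: a + b, lambda a, b: a * b)
print(make_ordered_reduce(4, (2, 1, 0), foos)([1, 2, 3, 4]))  # 1
print(make_ordered_reduce(4, (0, 1, 0), foos)([1, 2, 3, 4]))  # -7
```

Building the plan still deletes from a list of slot numbers, but that happens once per type-ordering rather than once per call.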
(EDIT:) Some Timings:
from operator import add
from functools import reduce
from itertools import repeat
from random import random
r = 100000
xs = [random() for _ in range(r)]
# slightly trivial choices of pairs and functions, to replicate reduce
ps = [0]*(r-1)
fs = repeat(add)
foo = ordered_reduce(None, ps, fs, precompute=True)
>>> %timeit reduce(add, xs)
100 loops, best of 3: 3.59 ms per loop
>>> %timeit foo(xs)
1 loop, best of 3: 1.44 s per loop
This is something of a worst-case scenario, and slightly cheating, since reduce does not take an iterable of functions. However, a function that does take one (but no ordering) is still pretty fast:
def multi_reduce(fs, xs):
    xs = iter(xs)
    x = next(xs)
    for f, nx in zip(fs, xs):
        x = f(x, nx)
    return x
>>> %timeit multi_reduce(fs, xs)
100 loops, best of 3: 8.71 ms per loop
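The gap between foo and the other two timings is probably dominated by `del y[p]`, which has to move every element after position p one place to the left. A rough sketch that counts those moves instead of timing them (the helper name `count_shifts` is made up for illustration):

```python
def count_shifts(n, pair_indexes):
    """Count the elements that `del y[p]` would move over a whole reduction."""
    length = n
    shifts = 0
    for p in pair_indexes:
        shifts += length - p - 1  # elements after position p move left by one
        length -= 1               # the list is one element shorter afterwards
    return shifts


n = 100000
print(count_shifts(n, [0] * (n - 1)))           # 4999950000, about n**2 / 2
print(count_shifts(n, range(n - 2, -1, -1)))    # 99999, one move per step
```

So always reducing at index 0 is the quadratic case, while reducing from the right end moves only one element per step.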
(EDIT2): and for fun, the performance of a massively cheating 'compiled' version, which gives some idea of the total overhead occurring.
from numba import jit
@jit(nopython=True)
def numba_sum(xs):
    y = 0
    for x in xs:
        y += x
    return y
>>> %timeit numba_sum(xs)
1000 loops, best of 3: 1.46 ms per loop