I have two functions that essentially do the same thing, and I want to compare their runtime performance.
import json

def calculate_bonus_by_table(data, value):
    cdf = data["cdf"]
    # CAUTION: the following loop doesn't do bound checking
    i = 0
    while value > cdf[i]:
        i += 1
    return data['bonus'][i]

def calculate_bonus_by_regression(data, value):
    max_slope = len(data['bonus']) - 1
    slope = int((value - 1) / 6)
    if slope > max_slope:
        slope = max_slope
    slope += 1
    return (0.205078125 * slope**2) + (0.68359375 * slope) - 70.888671875

with open('bonus.json') as f:
    data = json.load(f)
A snippet of the JSON file used above:
{ "cdf": [6, 12, 18, 24, 30, 36, ...], "bonus": [-70, -68, -66, -64, -62, ...] }
In IPython, I time both functions separately:
%timeit calculate_bonus_by_table(data, 199)
1000000 loops, best of 3: 1.64 µs per loop
%timeit calculate_bonus_by_regression(data, 199)
The slowest run took 7.99 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 604 ns per loop
Running timeit for both functions multiple times always gives me similar results. The _by_regression function always triggers the warning about caching.
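For anyone who wants to reproduce this outside IPython, here is a minimal sketch using the standard-library timeit module. It reports the min, median, and max per-call time over several repeats, which makes the spread behind the "slowest run" warning visible. Note the assumptions: the data dict is a made-up stand-in with the same shape as my JSON snippet (not the real bonus.json), and summarize is a hypothetical helper name.

```python
import statistics
import timeit

# Hypothetical stand-in for bonus.json: same shape as the snippet,
# but the values are invented for illustration only.
data = {
    "cdf": list(range(6, 246, 6)),     # 6, 12, ..., 240
    "bonus": list(range(-70, 10, 2)),  # -70, -68, ..., 8
}

def calculate_bonus_by_table(data, value):
    cdf = data["cdf"]
    # CAUTION: the following loop doesn't do bound checking
    i = 0
    while value > cdf[i]:
        i += 1
    return data["bonus"][i]

def calculate_bonus_by_regression(data, value):
    max_slope = len(data["bonus"]) - 1
    slope = int((value - 1) / 6)
    if slope > max_slope:
        slope = max_slope
    slope += 1
    return (0.205078125 * slope**2) + (0.68359375 * slope) - 70.888671875

def summarize(func, value, repeat=5, number=20_000):
    """Return min/median/max per-call time in nanoseconds (hypothetical helper)."""
    # The lambda adds a small constant overhead to every call, so the
    # numbers are only meaningful relative to each other.
    raw = timeit.repeat(lambda: func(data, value), repeat=repeat, number=number)
    per_call_ns = [t / number * 1e9 for t in raw]
    return {
        "min": min(per_call_ns),
        "median": statistics.median(per_call_ns),
        "max": max(per_call_ns),
    }

for func in (calculate_bonus_by_table, calculate_bonus_by_regression):
    stats = summarize(func, 199)
    print(f"{func.__name__}: min={stats['min']:.0f} ns, "
          f"median={stats['median']:.0f} ns, max={stats['max']:.0f} ns")
```

If max is several times larger than min for only one of the functions, that would match the warning I see in IPython.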
How can I compare the two if one is cached while the other isn't? Why is the _by_regression function cached while _by_table isn't? Will the performance and caching of _by_regression hold in a production environment, or should I assume the worst-case performance (from above, 7.99 * 604 ns = 4.83 µs)?
Thanks