If a function has been written to take advantage of NumPy's compiled code (or is itself compiled), it will be hard to improve on its performance with caching.
In [1]: import numpy as np
   ...: from scipy.special import gammaln
In [2]: np_array = np.random.randint(5, size=500)*1000 + 3
In [3]: np_array.shape
Out[3]: (500,)
In [4]: timeit gammaln(np_array).sum()
10000 loops, best of 3: 34.9 µs per loop
Evaluating the function for each element individually is much slower:
In [5]: timeit sum([gammaln(i) for i in np_array])
1000 loops, best of 3: 1.8 ms per loop
Even simply iterating through the array, without calling the function at all, is slower than the vectorized call:
In [6]: timeit sum([i for i in np_array])
10000 loops, best of 3: 127 µs per loop
In this case the values come from a small set of integers:
In [10]: np.unique(np_array)
Out[10]: array([ 3, 1003, 2003, 3003, 4003])
But identifying those unique values takes nearly as long as calculating the function for all 500.
In [11]: timeit np.unique(np_array)
...
In [12]: timeit gammaln(np.unique(np_array))
In fact it is slower if I also ask for the indices that can reconstruct the array:
In [14]: timeit np.unique(np_array, return_inverse=True)
10000 loops, best of 3: 54 µs per loop
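As a quick sanity check (my own illustration, using the np_array defined above), the inverse indices returned here do rebuild the original array exactly:

unq, inv = np.unique(np_array, return_inverse=True)   # unique values plus inverse mapping
assert np.array_equal(unq[inv], np_array)             # unq[inv] reconstructs the original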
There are caching or memoizing recipes for Python, including the @functools.lru_cache decorator. But I haven't seen them used much with numpy arrays. If your problem is suitable for caching, I suspect it is best done in lower-level code, e.g. with Cython and existing C or C++ cache libraries.
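For completeness, here is a minimal sketch of what that per-element memoization looks like; gammaln_cached is my own name, not a library function, and since ndarrays are unhashable the cache can only work scalar by scalar:

from functools import lru_cache
from scipy.special import gammaln

@lru_cache(maxsize=None)                 # cache keyed by the hashable scalar argument
def gammaln_cached(x):
    return float(gammaln(x))             # scalar evaluation; repeats become dict lookups

total = sum(gammaln_cached(int(i)) for i in np_array)   # int() so equal values share a key

Even with a perfect hit rate this still pays Python-level call overhead per element, which is why it rarely competes with the vectorized call.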
Cache with unique
We could get cache-like behavior with unique: use it to get both the unique values and the indices that let us recreate the original array:
In [5]: unq, idx = np.unique(np_array, return_inverse=True)
In [6]: res1 = gammaln(np_array)
In [7]: res2 = gammaln(unq)
In [8]: res2 = res2[idx]
In [9]: np.allclose(res1, res2)
Out[9]: True
But compare the times:
In [10]: timeit res1 = gammaln(np_array)
10000 loops, best of 3: 29.1 µs per loop
In [11]: %%timeit
...: unq, idx = np.unique(np_array, return_inverse=True)
...: res2 = gammaln(unq)[idx]
10000 loops, best of 3: 63.3 µs per loop
Most of this extra time goes into computing the unique values; once we have the mapping, applying it is quite fast:
In [12]: timeit res2 = gammaln(unq)[idx]
100000 loops, best of 3: 6.12 µs per loop
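If this pattern comes up often, it can be wrapped in a small helper; apply_via_unique is my own sketch, not an existing function:

def apply_via_unique(func, arr):
    # evaluate func once per unique value, then broadcast back via the inverse indices
    unq, inv = np.unique(arr, return_inverse=True)
    return func(unq)[inv]

res = apply_via_unique(gammaln, np_array)   # same result as gammaln(np_array)

It only pays off when func is expensive enough per element to amortize the cost of the unique sort.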
unique with sum
Returning the sum of the values rather than the values themselves shifts the relative times a bit:
In [13]: %timeit res1 = gammaln(np_array).sum()
10000 loops, best of 3: 35.3 µs per loop
In [14]: %%timeit
...: unq, idx = np.unique(np_array, return_inverse=True)
...: res2 = gammaln(unq)[idx].sum()
10000 loops, best of 3: 71.3 µs per loop
With Paul Panzer's count approach, which weights each unique value by its multiplicity instead of expanding back to the full array:
In [21]: %%timeit
...: unq, cnt = np.unique(np_array, return_counts=True)
...: res = (gammaln(unq)*cnt).sum()
10000 loops, best of 3: 59.5 µs per loop
In [22]: %%timeit
...: unq, cnt = np.unique(np_array, return_counts=True)
...: res = np.dot(gammaln(unq), cnt)
10000 loops, best of 3: 52.1 µs per loop
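The same idea as a reusable sketch (again my own naming): the counts turn the full sum into a weighted sum over the unique values:

def sum_via_unique(func, arr):
    # weight each unique value by its count instead of evaluating func len(arr) times
    unq, cnt = np.unique(arr, return_counts=True)
    return np.dot(func(unq), cnt)

total = sum_via_unique(gammaln, np_array)   # equals gammaln(np_array).sum()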