I'm looking for a way to calculate the cumulative sum with numpy, but I don't want to roll the value forward when the cumulative sum is very close to zero and negative; instead I'd like to set it to zero at that point.

For instance

a = np.asarray([0, 4999, -5000, 1000])
np.cumsum(a)

returns [0, 4999, -1, 999]

but I'd like to set the value at index 2 (-1) to zero during the calculation. The problem is that this decision can only be made during the calculation, as the intermediate result isn't known a priori.

The expected array is: [0, 4999, 0, 1000]

The reason for this is that I'm getting very small values (floating point, not integers as in the example) that arise from floating-point calculations but should in reality be zero. Calculating the cumulative sum compounds those values, which leads to errors.
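
For illustration, a plain Python/numpy loop with the behaviour I'm after might look like this (the tolerance tol is just a placeholder for whatever threshold turns out to be appropriate; it is not part of the original calculation):

import numpy as np

def clamped_cumsum(x, tol=1e-9):
    # cumulative sum that resets an intermediate result to zero
    # whenever it is negative but within tol of zero
    out = np.empty(len(x), dtype=float)
    running = 0.0
    for i, v in enumerate(x):
        running += v
        if -tol < running < 0.0:
            running = 0.0
        out[i] = running
    return out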

orange
  • How large are the values in the "desired" part of the array? If the negative values are near zero anyway, how much damage is done to your calculation by just letting them be accumulated? – mtrw Feb 10 '16 at 13:17
  • Indeed, see my comment http://stackoverflow.com/questions/35309237/conditional-numpy-cumulative-sum?noredirect=1#comment58336268_35310527. I arrived at this conclusion as well. – orange Feb 10 '16 at 13:19

3 Answers


The Kahan summation algorithm could solve the problem. Unfortunately, it is not implemented in numpy. This means a custom implementation is required:

import numpy as np

def kahan_cumsum(x):
    # Kahan compensation only makes sense in floating point,
    # so work on a float copy of the input
    x = np.asarray(x, dtype=float)
    cumulator = np.zeros_like(x)
    compensation = 0.0

    cumulator[0] = x[0]
    for i in range(1, len(x)):
        y = x[i] - compensation                     # input corrected by the running error
        t = cumulator[i - 1] + y                    # naive partial sum
        compensation = (t - cumulator[i - 1]) - y   # error lost in that addition
        cumulator[i] = t
    return cumulator
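
For example, a quick check with the question's values cast to float (all values are exactly representable, so the result is exact):

a = np.asarray([0.0, 4999.0, -5000.0, 1000.0])
print(kahan_cumsum(a))   # -> [0., 4999., -1., 999.]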

I have to admit, this is not exactly what was asked for in the question (a value of -1 at the third output of the cumsum is correct in this example). However, I hope it solves the actual problem behind the question, which is related to floating-point precision.

MB-F
  • Thanks. Good to know. But the numerical error isn't a result of the aggregation, but has already happened in a previous (more complex) calculation. – orange Feb 10 '16 at 10:59
  • I see. Are you sure the error is actually a problem? The sum is usually very robust to small deviations. If the values are close enough to zero (up to numerical precision), maybe the resulting sums are close enough to the desired results. Just a thought :) – MB-F Feb 10 '16 at 11:04
  • You are probably right. The compounded error may not be such a big deal as it's close to zero anyway and doesn't go through lots of further iterations in the cumsum loop. I may have overcomplicated this in hope for a "cleaner" solution... – orange Feb 10 '16 at 11:18

I wonder if rounding will do what you are asking for:

np.cumsum(np.around(a,-1))
# the -1 means it rounds to the nearest 10

gives

array([   0, 5000,    0, 1000])

It is not exactly the expected array you gave in your question, but using around, perhaps with the decimals parameter set to 0, might work when you apply it to the actual problem with floats.
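
For instance, with floats carrying tiny numerical residuals, rounding the inputs before the cumulative sum removes them (the values and the decimals setting here are only illustrative):

import numpy as np

a = np.array([0.0, 1e-12, -1e-12, 1000.0])
print(np.cumsum(np.around(a, decimals=8)))   # roughly [0., 0., 0., 1000.]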

Lee
  • I don't think this will work. The intermediate value for the decision whether or not to round is created during the cumsum calculation. Its first occurrence can be rounded post cumsum, but it would already have been added to the next elements in the array, which is what I try to avoid. – orange Feb 10 '16 at 10:58
  • @orange could you add an example with floats to your question? – Lee Feb 10 '16 at 13:33

Probably the best way to go is to write this bit in Cython (name the file cumsum_eps.pyx):

cimport numpy as cnp
import numpy as np

cdef inline _cumsum_eps_f4(float *A, int ndim, int dims[], float *out, float eps):
    cdef float sum
    cdef size_t ofs

    N = 1
    for i in xrange(0, ndim - 1):
        N *= dims[i]
    ofs = 0
    for i in xrange(0, N):
        sum = 0
        for k in xrange(0, dims[ndim-1]):
            sum += A[ofs]
            # clamp the running sum to zero whenever it is within eps of zero
            if abs(sum) < eps:
                sum = 0
            out[ofs] = sum
            ofs += 1

def cumsum_eps_f4(cnp.ndarray[cnp.float32_t, mode='c'] A, shape, float eps):
    cdef cnp.ndarray[cnp.float32_t] _out
    cdef cnp.ndarray[cnp.int_t] _shape
    N = np.prod(shape)
    out = np.zeros(N, dtype=np.float32)
    _out = <cnp.ndarray[cnp.float32_t]> out
    _shape = <cnp.ndarray[cnp.int_t]> np.array(shape, dtype=np.int)
    _cumsum_eps_f4(&A[0], len(shape), <int*> &_shape[0], &_out[0], eps)
    return out.reshape(shape)


def cumsum_eps(A, axis=None, eps=np.finfo('float').eps):
    A = np.array(A)
    if axis is None:
        A = np.ravel(A)
    else:
        # swap the requested axis to the last position; the C routine
        # cumulates along the last axis
        axes = list(xrange(len(A.shape)))
        axes[axis], axes[-1] = axes[-1], axes[axis]
        A = np.transpose(A, axes)
    if A.dtype == np.float32:
        out = cumsum_eps_f4(np.ravel(np.ascontiguousarray(A)), A.shape, eps)
    else:
        raise ValueError('Unsupported dtype')
    if axis is not None: out = np.transpose(out, axes)
    return out

then you can compile it like this (Windows, Visual C++ 2008 Command Line):

\Python27\Scripts\cython.exe cumsum_eps.pyx
cl /c cumsum_eps.c /IC:\Python27\include /IC:\Python27\Lib\site-packages\numpy\core\include
link /dll cumsum_eps.obj C:\Python27\libs\python27.lib /OUT:cumsum_eps.pyd

or like this (with gcc; on Linux use the .so extension, on Cygwin the .dll extension):

cython cumsum_eps.pyx
gcc -c cumsum_eps.c -o cumsum_eps.o -I/usr/include/python2.7 -I/usr/lib/python2.7/site-packages/numpy/core/include
gcc -shared cumsum_eps.o -o cumsum_eps.so -lpython2.7

and use it like this:

from cumsum_eps import *
import numpy as np
x = np.array([[1,2,3,4], [5,6,7,8]], dtype=np.float32)

>>> print cumsum_eps(x)
[  1.   3.   6.  10.  15.  21.  28.  36.]
>>> print cumsum_eps(x, axis=0)
[[  1.   2.   3.   4.]
 [  6.   8.  10.  12.]]
>>> print cumsum_eps(x, axis=1)
[[  1.   3.   6.  10.]
 [  5.  11.  18.  26.]]
>>> print cumsum_eps(x, axis=0, eps=1)
[[  1.   2.   3.   4.]
 [  6.   8.  10.  12.]]
>>> print cumsum_eps(x, axis=0, eps=2)
[[  0.   2.   3.   4.]
 [  5.   8.  10.  12.]]
>>> print cumsum_eps(x, axis=0, eps=3)
[[  0.   0.   3.   4.]
 [  5.   6.  10.  12.]]
>>> print cumsum_eps(x, axis=0, eps=4)
[[  0.   0.   0.   4.]
 [  5.   6.   7.  12.]]
>>> print cumsum_eps(x, axis=0, eps=8)
[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  8.]]
>>> print cumsum_eps(x, axis=1, eps=3)
[[  0.   0.   3.   7.]
 [  5.  11.  18.  26.]]

and so on. Of course, normally eps would be some small value; integers are used here just for the sake of demonstration and ease of typing.

If you need this for double precision as well, the _f8 variants are trivial to write, and another case has to be handled in cumsum_eps().

When you're happy with the implementation, you should make it a proper part of your setup.py (see the Cython documentation on compilation via setup.py).
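
A minimal setup.py for this might look like the following (a sketch; it assumes the source file is still called cumsum_eps.pyx and that the numpy headers are needed because of the cimport):

from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize
import numpy

setup(
    name='cumsum_eps',
    ext_modules=cythonize([
        Extension('cumsum_eps', ['cumsum_eps.pyx'],
                  include_dirs=[numpy.get_include()]),
    ]),
)

Then python setup.py build_ext --inplace replaces the manual cython/cl/gcc invocations above.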

Update #1: If you have good compiler support in your run environment, you could try Theano to implement either the compensation algorithm or your original idea:

import numpy as np
import theano
import theano.tensor as T
from theano.ifelse import ifelse

A=T.vector('A')

sum=T.as_tensor_variable(np.asarray(0, dtype=np.float64))

res, upd=theano.scan(fn=lambda cur_sum, val: ifelse(T.lt(cur_sum+val, 1.0), np.asarray(0, dtype=np.float64), cur_sum+val), outputs_info=sum, sequences=A)

f=theano.function(inputs=[A], outputs=res)

f([0.9, 2, 3, 4])

will give [0 2 3 4] as output. With either the Cython or the Theano version you get roughly the performance of native code.

ALGOholic
  • Yes, I might roll my own in cython, as it doesn't look like there's a straight numpy solution. – orange Feb 10 '16 at 10:52
  • Btw, you don't need to precompile your pyx file. You can just import it using `pyximport.install` and it compiles during first import (a minimal sketch follows after these comments). – orange Feb 10 '16 at 10:53
  • I assume that run environment is not build environment (i.e. there is no C compiler). – ALGOholic Feb 10 '16 at 11:00
  • I see. The run environment has the capability to compile in my case. – orange Feb 10 '16 at 11:22
  • You could try Theano as well if you have good compiler support. `import theano import theano.tensor as T from theano.ifelse import ifelse A=T.vector('A') sum=T.as_tensor_variable(np.asarray(0, dtype=np.float64)) res, upd=theano.scan(fn=lambda cur_sum, val: ifelse(T.lt(cur_sum+val, 1.0), np.asarray(0, dtype=np.float64), cur_sum+val), outputs_info=sum, sequence=A) f=theano.function(inputs=[A], outputs=res) f([0.9, 2, 3, 4])` will give [0 2 3 4] output – ALGOholic Feb 10 '16 at 13:05
  • I get `[0. 2. 5. 9.]` on my machine with your `theano` code instead of `[0 2 3 4]` from your answer--the only change is the `print()` call: `print(f([0.9, 2, 3, 4]))`, to see the results. Assuming the code is in the `theano_algoholic.py` file, I run it using: `docker run --rm -it -v "$PWD:/prj" kaixhin/theano python /prj/theano_algoholic.py` – jfs Oct 28 '16 at 08:13
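
A minimal sketch of the pyximport route mentioned in the comments above (assumptions: the numpy include directory is passed via setup_args because the module cimports numpy, and cumsum_eps.pyx sits on the import path; compilation then happens on first import):

import numpy as np
import pyximport
pyximport.install(setup_args={'include_dirs': [np.get_include()]})

from cumsum_eps import cumsum_eps   # compiles cumsum_eps.pyx on first import

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=np.float32)
print(cumsum_eps(x, axis=1, eps=3))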