20

I want to add numpy arrays with datatype uint8. I know that the values in these arrays may be large enough for an overflow to happen. So I get something like:

a = np.array([100, 200, 250], dtype=np.uint8)
b = np.array([50, 50, 50], dtype=np.uint8)
a += b

Now, a is [150 250 44]. However, instead of overflowing, I want values that are too large for uint8 to be clamped to the maximum allowed for uint8. So my desired result would be [150 250 255].

I could get this result with the following code:

a = np.array([100, 200, 250], dtype=np.uint8)
b = np.array([50, 50, 50], dtype=np.uint8)
c = np.zeros((1,3), dtype=np.uint16)
c += a
c += b
c[c>255] = 255
a = np.array(c, dtype=np.uint8)

The problem is that my arrays are really big, so creating a third array with a larger datatype could be a memory issue. Is there a fast and more memory-efficient way to achieve the described result?

kmario23
Thomas
  • [DIPlib](https://diplib.org)’s integer addition saturates. DIPlib functions work directly on NumPy arrays, and you can convert between its image type and NumPy arrays without copying the data. – Cris Luengo Nov 12 '22 at 21:13

7 Answers

9

You can achieve this by creating a third array of dtype uint8, plus a bool array (which together are more memory efficient than one uint16 array).

np.putmask is useful for avoiding a temp array.

a = np.array([100, 200, 250], dtype=np.uint8)
b = np.array([50, 50, 50], dtype=np.uint8)
c = 255 - b  # a temp uint8 array here
np.putmask(a, c < a, c)  # a temp bool array here
a += b

However, as @moarningsun correctly points out, a bool array takes the same amount of memory as a uint8 array, so this isn't necessarily helpful. It is possible to solve this by avoiding having more than one temp array at any given time:

a = np.array([100, 200, 250], dtype=np.uint8)
b = np.array([50, 50, 50], dtype=np.uint8)
b = 255 - b  # old b is gone shortly after new array is created
np.putmask(a, b < a, b)  # a temp bool array here, then it's gone
a += 255 - b  # a temp array here, then it's gone

This approach trades memory consumption for CPU.


Another approach is to precalculate all possible results, which is O(1) extra memory (i.e. independent of the size of your arrays):

c = np.clip(np.arange(256) + np.arange(256)[..., np.newaxis], 0, 255).astype(np.uint8)
c
=> array([[  0,   1,   2, ..., 253, 254, 255],
          [  1,   2,   3, ..., 254, 255, 255],
          [  2,   3,   4, ..., 255, 255, 255],
          ..., 
          [253, 254, 255, ..., 255, 255, 255],
          [254, 255, 255, ..., 255, 255, 255],
          [255, 255, 255, ..., 255, 255, 255]], dtype=uint8)

c[a,b]
=> array([150, 250, 255], dtype=uint8)

This approach is the most memory-efficient if your arrays are very big. Again, it is expensive in processing time, because it replaces the super-fast integer additions with the slower 2-D array indexing.

EXPLANATION OF HOW IT WORKS

Construction of the c array above makes use of a numpy broadcasting trick. Adding an array of shape (N,) and an array of shape (N,1) broadcasts both to shape (N,N), so the result is an NxN array of all possible sums. Then, we clip it. We get a 2-D array that satisfies c[i,j]=min(i+j,255) for each i,j.

Then what's left is using fancy indexing to grab the right values. Working with the input you provided, we access:

c[( [100, 200, 250] , [50, 50, 50] )]

The first index-array refers to the 1st dim, and the second to the 2nd dim. Thus the result is an array of the same shape as the index arrays ((N,)), consisting of the values [ c[100,50] , c[200,50] , c[250,50] ].
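
As a quick sanity check of the explanation above, the following sketch rebuilds the table and confirms both the min(i+j,255) property and the fancy-indexing result. The names `values` and `table` are only illustrative (the answer calls the table `c`).

```python
import numpy as np

# Build the 256x256 table: the column vector of shape (256, 1) broadcasts
# against the row vector of shape (256,) to give every pairwise sum.
values = np.arange(256)
table = np.clip(values[:, np.newaxis] + values, 0, 255).astype(np.uint8)

# Every entry should equal min(i + j, 255).
i, j = np.indices((256, 256))
assert np.array_equal(table, np.minimum(i + j, 255))

a = np.array([100, 200, 250], dtype=np.uint8)
b = np.array([50, 50, 50], dtype=np.uint8)

# Fancy indexing pairs a[k] with b[k] and looks up table[a[k], b[k]].
result = table[a, b]
print(result)  # [150 250 255]
```

The table itself occupies 256 * 256 bytes (64 KiB) regardless of how large a and b are; note that the indexing still allocates a new result array of the same size as a.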

shx2
  • 1
    Did not know about `putmask`, thanks for that! Using that function, I think `a += b` followed by `np.putmask(a, a –  Apr 13 '15 at 17:46
  • @moarningsun I think you are correct. However it relies on overflowing, which I personally don't feel perfectly comfortable with... – shx2 Apr 13 '15 at 17:48
  • @moarningsun why did you delete your answer? I think it is a decent answer and it works – shx2 Apr 13 '15 at 18:17
  • That method requires `putmask` to be efficient, which was one of the main points from your answer. So I figured I might as well put it as a comment here. –  Apr 13 '15 at 18:20
  • I tried your third method and it works perfectly fine. However, could you explain a little bit more what it does? I compared the execution time of this third method ('precalculate') with the case where I just use a uint16 array for a, do a+=b and a[a>255]=255. 'Precalculate' takes approximately double the time. But if it is more memory efficient, I will be able to do the calculations with large arrays for which I run out of memory with the uint16 approach. – Thomas Apr 14 '15 at 08:18
8

Here is a way:

>>> a = np.array([100, 200, 250], dtype=np.uint8)
>>> b = np.array([50, 50, 50], dtype=np.uint8)
>>> a+=b; a[a<b]=255
>>> a
array([150, 250, 255], dtype=uint8)
  • This works because an overflow will be like making a number negative. 200 + 100 => 44 because 200 is 56 away from 256(=0) so -56+100 = 44. So the result will always be smaller than b if an overflow happened. – Gelliant Oct 08 '21 at 12:20
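
The claim in the comment above can be checked exhaustively, since there are only 256 * 256 possible uint8 pairs. This is a minimal sketch; the names `x`, `y`, `wrapped` and `saturated` are just for illustration.

```python
import numpy as np

# All 256 * 256 possible (x, y) pairs of uint8 values.
x, y = np.meshgrid(np.arange(256, dtype=np.uint8),
                   np.arange(256, dtype=np.uint8))

wrapped = x + y                      # uint8 array addition wraps modulo 256
true_sum = x.astype(np.uint16) + y   # exact sum, no wraparound

overflowed = true_sum > 255
# The wrapped result is below the second operand exactly when overflow occurred.
assert np.array_equal(wrapped < y, overflowed)
# By symmetry, the same holds for the first operand.
assert np.array_equal(wrapped < x, overflowed)

# Replacing the wrapped entries with 255 gives the saturated sum.
saturated = wrapped.copy()
saturated[wrapped < y] = 255
assert np.array_equal(saturated, np.minimum(true_sum, 255))
```

The reason is that an overflowed sum equals x + y - 256, which is smaller than y because x is at most 255, while a sum that did not overflow is at least y.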
4

How about doing

>>> a + np.minimum(255 - a, b)
array([150, 250, 255], dtype=uint8)

More generally, you can get the maximum value for your datatype with

np.iinfo(np.uint8).max
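
Putting the two pieces together, a sketch of a dtype-generic version might look like this. The function name `saturating_add` is hypothetical, and the sketch assumes both inputs are NumPy arrays of the same unsigned integer dtype.

```python
import numpy as np

def saturating_add(a, b):
    """Hypothetical helper: element-wise a + b, clamped to the dtype's maximum."""
    if a.dtype != b.dtype or a.dtype.kind != "u":
        raise TypeError("expected two arrays of the same unsigned integer dtype")
    # Keep the limit in the array's own dtype so no upcasting takes place.
    limit = a.dtype.type(np.iinfo(a.dtype).max)
    # limit - a is the headroom left in each element; never add more than that.
    return a + np.minimum(limit - a, b)

a8 = np.array([100, 200, 250], dtype=np.uint8)
b8 = np.array([50, 50, 50], dtype=np.uint8)
print(saturating_add(a8, b8))    # [150 250 255]

a16 = np.array([65000, 10], dtype=np.uint16)
b16 = np.array([1000, 20], dtype=np.uint16)
print(saturating_add(a16, b16))  # [65535    30]
```

This still allocates temporaries of the same dtype as the inputs (the headroom array and the result), so it avoids the wider dtype but not the extra arrays.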
YXD
2

You can do it truly in place with Numba, for example:

import numba

@numba.jit('void(u1[:],u1[:])', locals={'temp': numba.uint16})
def add_uint8_inplace_clip(a, b):
    for i in range(a.shape[0]):
        temp = a[i] + b[i]
        a[i] = temp if temp<256 else 255

add_uint8_inplace_clip(a, b)

Or with Numexpr, for example:

import numexpr

numexpr.evaluate('where((a+b)>255, 255, a+b)', out=a, casting='unsafe')

Numexpr upcasts uint8 to int32 internally, before putting it back in the uint8 array.

2
import numpy as np

def non_overflowing_sum(a, b):
    c = np.uint16(a) + b  # widen first so the sum cannot wrap around
    c[c > 255] = 255      # clamp to the uint8 maximum
    return np.uint8(c)

This trades memory too, but I find it more elegant, and the temporary uint16 array is freed after the conversion on return.
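
If the size of the uint16 temporary is the concern, one possible variation (not part of this answer) is to apply the same widen-and-clamp idea one chunk at a time, so the temporary never exceeds a fixed size. This is only a sketch: the name `saturating_add_chunked` and the `chunk_size` parameter are hypothetical, and it assumes a is a C-contiguous uint8 array so that `reshape(-1)` returns a view.

```python
import numpy as np

def saturating_add_chunked(a, b, chunk_size=1 << 20):
    """Hypothetical helper: a += b with saturation at 255, one chunk at a time."""
    if a.dtype != np.uint8 or b.dtype != np.uint8 or a.shape != b.shape:
        raise TypeError("expected two uint8 arrays of the same shape")
    if not a.flags.c_contiguous:
        raise ValueError("a must be C-contiguous so it can be updated in place")
    flat_a = a.reshape(-1)  # a view, because a is C-contiguous
    flat_b = b.reshape(-1)  # may be a copy if b is not contiguous; only read
    for start in range(0, flat_a.size, chunk_size):
        stop = start + chunk_size
        # The uint16 temporary holds at most chunk_size elements.
        total = flat_a[start:stop].astype(np.uint16) + flat_b[start:stop]
        np.minimum(total, 255, out=total)
        flat_a[start:stop] = total  # values are <= 255, so nothing is lost
    return a

a = np.array([100, 200, 250], dtype=np.uint8)
b = np.array([50, 50, 50], dtype=np.uint8)
saturating_add_chunked(a, b, chunk_size=2)
print(a)  # [150 250 255]
```

A smaller chunk_size lowers the peak extra memory at the cost of more Python-level loop iterations.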

2

OpenCV has such a function: cv2.addWeighted

Maksim Surov
-1

There's a function in numpy for this:

numpy.nan_to_num(x)

Replace nan with zero and inf with finite numbers.

Returns an array or scalar replacing Not a Number (NaN) with zero, (positive) infinity with a very large number and negative infinity with a very small (or negative) number.

New Array with the same shape as x and dtype of the element in x with the greatest precision.

If x is inexact, then NaN is replaced by zero, and infinity (-infinity) is replaced by the largest (smallest or most negative) floating point value that fits in the output dtype. If x is not inexact, then a copy of x is returned.

I'm not sure if it will work with uint8, because of the mention of floating point in the output, but it may be useful for other readers.

Community
Toshinou Kyouko
  • 1
    I don't see how this could help with the question. There are no NaNs or infinite values in any of the arrays that shall be added. So maybe I'm missing the point of your answer? – Thomas Feb 29 '16 at 05:45
  • @Thomas hmm, perhaps it is different for integer types, but when I encountered the problem with floats, the overflows appeared as +/- infinities – Toshinou Kyouko Feb 29 '16 at 08:49
  • @ToshinouKyouko Yes, it is indeed different for integers, they simply overflow as in the example of OP. – luator May 31 '17 at 09:59