Numpy: use bins with infinite range

Question

In my Python script I have floats that I want to bin. Right now I'm doing:

import numpy

min_val = 0.0
max_val = 1.0
num_bins = 20
# linspace returns bin edges, so num_bins bins need num_bins + 1 edges
my_bins = numpy.linspace(min_val, max_val, num_bins + 1)
hist, my_bins = numpy.histogram(myValues, bins=my_bins)

But now I want to add two more bins to account for values that are < 0.0 and for those that are >= 1.0. One bin should thus include all values in (-inf, 0), the other all values in [1, inf).

Is there any straightforward way to do this while still using numpy's histogram function?

score 11 · Accepted Answer · edited Jul 24 '12 at 15:44

The function numpy.histogram() happily accepts infinite values in the bins argument:

numpy.histogram(my_values, bins=numpy.r_[-numpy.inf, my_bins, numpy.inf])

Alternatively, you could use a combination of numpy.searchsorted() and numpy.bincount(), though I don't see much advantage to that approach.
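For comparison, here is a minimal sketch of the searchsorted/bincount alternative next to the histogram call. The sample data, the seed, and the names inner_edges, idx and counts are made up for illustration; with side="right", a value equal to an edge goes into the bin above it, which matches how numpy.histogram treats the left edge of each bin.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up sample data that spills over both ends of [0, 1]
my_values = rng.normal(loc=0.5, scale=0.5, size=1000)

inner_edges = np.linspace(0.0, 1.0, 21)  # 21 edges -> 20 bins inside [0, 1]

# Approach from the answer: pad the edges with -inf and +inf
hist, _ = np.histogram(my_values, bins=np.r_[-np.inf, inner_edges, np.inf])

# Alternative: index 0 is the underflow bin, index len(inner_edges) the overflow bin
idx = np.searchsorted(inner_edges, my_values, side="right")
counts = np.bincount(idx, minlength=len(inner_edges) + 1)

print(np.array_equal(hist, counts))
```

Both arrays have 22 entries: one underflow bin, 20 regular bins, and one overflow bin. Note that a value of exactly 1.0 now lands in the overflow bin, whereas without the infinite edges it would be counted in the last regular bin, because numpy.histogram closes only the final bin on the right.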

edited Jul 24 '12 at 15:44

jmetz

answered Jul 24 '12 at 15:16

Sven Marnach

With matplotlib (`plt`), `plt.hist` does not accept `inf` as a bin edge, even though it uses numpy's `histogram` internally (drawing infinite boxes is too much? :-)). But a VERY large finite value (compared to the typical range of my data) worked well in my case. – Josiah Yoder Aug 03 '23 at 16:14
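A different way to get plottable bins, sketched here as an assumption rather than as what the commenter did, is to clip the data into one extra finite bin on each side instead of using a very large edge. The data and the names inner_edges, plot_edges and clipped are hypothetical.

```python
import numpy as np

values = np.array([-2.5, -0.1, 0.0, 0.3, 0.99, 1.0, 4.2])  # made-up data
inner_edges = np.linspace(0.0, 1.0, 5)        # 4 bins of width 0.25
width = inner_edges[1] - inner_edges[0]

# Finite stand-ins for the -inf/+inf edges: one extra bin on each side
plot_edges = np.r_[inner_edges[0] - width, inner_edges, inner_edges[-1] + width]

# Clip so that out-of-range values fall into the two outer bins
clipped = np.clip(values, plot_edges[0], plot_edges[-1])
counts, _ = np.histogram(clipped, bins=plot_edges)

print(counts)  # [2 1 1 0 1 2]
```

Since every edge is finite, clipped and plot_edges can be passed on to plt.hist(clipped, bins=plot_edges); the outer bars then stand for "everything below" and "everything at or above" the regular range and should be labelled accordingly.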

score 3 · Answer 2 · answered Jul 24 '12 at 15:14

You can specify numpy.inf as the upper and -numpy.inf as the lower bin limits.
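A small sketch with made-up values shows where the boundary values end up; the data and edges here are chosen only for illustration.

```python
import numpy as np

values = np.array([-2.5, -0.1, 0.0, 0.3, 0.99, 1.0, 4.2])  # made-up data
edges = np.array([-np.inf, 0.0, 0.5, 1.0, np.inf])

counts, _ = np.histogram(values, bins=edges)
print(counts)  # [2 2 1 2]
```

The bins are [-inf, 0), [0, 0.5), [0.5, 1) and [1, inf], so 0.0 is counted in the first regular bin and 1.0 in the overflow bin, which matches the (-inf, 0) and [1, inf) ranges asked for in the question.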

answered Jul 24 '12 at 15:14

jmetz

score 0 · Answer 3 · answered Mar 20 '19 at 08:53

NumPy 1.15 and later provide histogram_bin_edges. With it, today's solution is to call histogram_bin_edges to get the bin edges, concatenate -inf and +inf, and pass the result as bins to histogram:

import numpy as np
a = [1,2,3,4,2,3,4,7,4,6,7,5,4,3,2,3]
# -np.inf / np.inf stand in for np.NINF / np.PINF, which NumPy 2.0 removed
np.histogram(a, bins=np.concatenate(([-np.inf], np.histogram_bin_edges(a), [np.inf])))

Results in:

(array([0, 1, 3, 0, 4, 0, 4, 1, 0, 1, 0, 2]),
array([-inf,  1. ,  1.6,  2.2,  2.8,  3.4,  4. ,  4.6,  5.2,  5.8,  6.4, 7. ,  inf]))

If you prefer to have the last bin empty (as I do), you can use the range parameter and add a small number to the maximum:

a = [1,2,3,4,2,3,4,7,4,6,7,5,4,3,2,3]
# -np.inf / np.inf stand in for np.NINF / np.PINF, which NumPy 2.0 removed
np.histogram(a, bins=np.concatenate(([-np.inf], np.histogram_bin_edges(a, range=(np.min(a), np.max(a)+.1)), [np.inf])))

Results in:

(array([0, 1, 3, 0, 4, 4, 0, 1, 0, 1, 2, 0]),
array([-inf, 1.  , 1.61, 2.22, 2.83, 3.44, 4.05, 4.66, 5.27, 5.88, 6.49, 7.1 ,  inf]))
