NumPy's bincount returns the populations of the individual bins. If some bins are empty, the corresponding count is zero, so the division by np.bincount(dig) breaks down for those bins (0/0 gives nan and a RuntimeWarning). A quick fix would be:
sol = np.bincount(dig, data) / np.maximum(np.bincount(dig), 1)
i.e., divide by 1 instead of 0 for such bins. This is safe because an empty bin also has a zero in np.bincount(dig, data), so the result for it is 0 (whether 0 is the right value depends on how you want to interpret the mean of an empty bin). This gives:
[ 0. -45.5 2. 3. 5.5 8. 65.5]
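For reference, here is a minimal end-to-end sketch of that quick fix. It assumes that dig comes from np.digitize and that bin_s holds the same edges as in the binned_statistic example further down; the names counts and sums are only illustrative.

```python
import numpy as np

# Assumed inputs: edges taken from the binned_statistic example,
# dig computed with np.digitize (an assumption about the question's setup)
data = np.array([-90, -1, 2, 3, 5, 6, 8, 10, 121])
bin_s = np.array([-np.inf, 1, 3, 5, 8, 9, np.inf])

dig = np.digitize(data, bin_s)          # bin index of every sample
counts = np.bincount(dig)               # number of samples per bin
sums = np.bincount(dig, weights=data)   # sum of the samples per bin

# Divide by 1 where a bin is empty, so that we get 0 / 1 = 0 instead of 0 / 0
sol = sums / np.maximum(counts, 1)
print(sol)
```

With these inputs, sol should match the array shown above, with the leading 0 coming from the empty bin at index zero.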
The first element here is not spurious: it corresponds to bin index zero, which would collect data smaller than min(bin_s). Since that minimum is -np.inf in your case, no data can fall there. However, intermediate bins can turn out to be empty as well. For example, if you take as input:
data = np.array([-90,-1,2,3,10,121])
Then np.bincount(dig) returns [0 2 1 1 0 0 2], so the other zeros need handling as well; simply discarding the first element is not enough.
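If you would rather flag empty bins than report 0 for them, one possible sketch is to divide only where the count is positive and leave NaN elsewhere. As before, it assumes dig comes from np.digitize with the edges in bin_s; n_bins and means are illustrative names. Passing minlength is an extra precaution, since bincount otherwise returns a shorter array whenever the trailing bins are empty.

```python
import numpy as np

data = np.array([-90, -1, 2, 3, 10, 121])
bin_s = np.array([-np.inf, 1, 3, 5, 8, 9, np.inf])   # assumed edges

dig = np.digitize(data, bin_s)
# With a last edge of +inf, finite data only yields indices 0 .. len(bin_s) - 1
n_bins = len(bin_s)

# minlength keeps the output length fixed even if the trailing bins are empty
counts = np.bincount(dig, minlength=n_bins)
sums = np.bincount(dig, weights=data, minlength=n_bins)

# Divide only where counts > 0; empty bins keep their initial NaN
means = np.full(n_bins, np.nan)
np.divide(sums, counts, out=means, where=counts > 0)
print(means)
```

Here the bins at indices 0, 4 and 5 stay NaN, which makes them easy to filter out afterwards with np.isnan.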
Also, you might consider binned_statistic from scipy.stats, which computes per-bin statistics directly:
import numpy as np
from scipy.stats import binned_statistic as bstat
data = np.array([-90,-1,2,3,5,6,8,10,121])
stat = bstat(data, data, statistic = 'mean', bins = [-np.inf, 1, 3, 5, 8, 9, +np.inf])
print(stat[0])