Mean of the nonzero values in each row, avoiding RuntimeWarning and NaN when some rows are all zero

Question

I already checked "Numpy mean of nonzero values" and it worked nicely. However, some rows of my matrix contain only zeros. What is a good way to avoid `RuntimeWarning: invalid value encountered in true_divide` in this case? Also, I don't want the zero elements to be replaced by NaN here.

import numpy as np

eachPSM = np.ones([3,4])
eachPSM[0] = 0
print(eachPSM)
>> [[ 0.  0.  0.  0.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]
print(np.true_divide(eachPSM.sum(1),(eachPSM!=0).sum(1)))
>> RuntimeWarning: invalid value encountered in true_divide
[ nan   1.   1.]

To get a more specific answer, it would help to know what you want that element to be, if not nan. – Oct 22 '17 at 19:48

Divakar · Accepted Answer · 2017-10-23T08:18:34.613

2

With a as the input array, you could use masking -

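import numpy as np

invalid_val = np.nan  # mean value to assign to all-zero rows
# float dtype, so an integer invalid_val (e.g. 0) does not truncate the means
out = np.full(a.shape[0], invalid_val, dtype=float)
count = (a != 0).sum(1)  # number of nonzero entries per row
valid_mask = count != 0  # rows with at least one nonzero entry
out[valid_mask] = a[valid_mask].sum(1) / count[valid_mask]  # divide only where count > 0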
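
The masking steps above can be wrapped into a small helper for reuse. This is only a minimal sketch: the function name `nonzero_row_mean` and its `fill` parameter are made up here for illustration and are not part of NumPy or of the original answer.

```python
import numpy as np

def nonzero_row_mean(a, fill=np.nan):
    """Mean of the nonzero entries in each row; `fill` for all-zero rows.

    Hypothetical helper name and signature, for illustration only.
    """
    a = np.asarray(a, dtype=float)
    out = np.full(a.shape[0], fill, dtype=float)
    count = (a != 0).sum(axis=1)      # nonzero entries per row
    valid = count != 0                # rows that have something to average
    out[valid] = a[valid].sum(axis=1) / count[valid]
    return out

eachPSM = np.ones([3, 4])
eachPSM[0] = 0
result = nonzero_row_mean(eachPSM, fill=0)
print(result)
```

Because the division only ever touches rows with a nonzero count, no 0/0 is evaluated, so there is nothing to warn about.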

edited Oct 23 '17 at 08:18

answered Oct 22 '17 at 17:07

Divakar


Let me ask why we can have two equal signs in `valid_mask = count!=0`? – Jan Oct 23 '17 at 07:19

@Jan `count!=0` is just a not-equal comparison. It gives us a boolean array, which is then assigned to `valid_mask`. – Divakar Oct 23 '17 at 08:18
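
To see the two steps separately, here is a small sketch with a made-up `count` array (the values are chosen only for illustration): the comparison is evaluated first and produces a boolean array, and the single `=` then binds that array to the name.

```python
import numpy as np

count = np.array([0, 4, 3])   # example counts, chosen for illustration

valid_mask = count != 0       # parsed as: valid_mask = (count != 0)
print(valid_mask)             # boolean array, one entry per row
print(valid_mask.dtype)

# A boolean array can index another array, keeping only the True positions.
print(count[valid_mask])
```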

score 0 · Answer 2 · answered Oct 22 '17 at 10:59


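import warnings
...
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    # the division has to run inside this block for the warning to be suppressed
    means = np.true_divide(eachPSM.sum(1), (eachPSM != 0).sum(1))

means[np.isnan(means)] = 0  # replace the NaN produced for all-zero rows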
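
NumPy also has its own context manager, `np.errstate`, for silencing floating-point warnings locally. The sketch below is one possible way to apply it to the question's array. The variable name `means` is introduced here for illustration, and `np.nan_to_num` is used to turn the resulting NaN into 0.

```python
import numpy as np

eachPSM = np.ones([3, 4])
eachPSM[0] = 0

# Silence only the "invalid value" (0/0) warning, and only inside this block.
with np.errstate(invalid="ignore"):
    means = np.true_divide(eachPSM.sum(1), (eachPSM != 0).sum(1))

means = np.nan_to_num(means)  # NaN -> 0.0 for the all-zero row
print(means)
```

Note that `np.nan_to_num` would also map any NaN that was already present in the data to 0, so this only suits inputs that are known to be NaN-free.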

answered Oct 22 '17 at 10:59

Eric Bridger


score 0 · Answer 3 · answered Oct 22 '17 at 11:00


Since anything divided by 1 equals the numerator, and the sum of an all-zero row is 0, you can replace the zero counts with 1, i.e.

x = eachPSM.sum(1)          # row sums
y = (eachPSM!=0).sum(1)     # nonzero counts per row
y[y==0] = 1                 # all-zero rows: 0/1 gives 0 instead of 0/0
np.true_divide(x,y)

#array([ 0.,  1.,  1.])
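
Another way to skip the division for all-zero rows, without modifying the counts, is the `where` argument of `np.divide`. This is a sketch using the same `x` and `y` as above. Positions where `where` is False keep whatever value is already in `out`, so `out` is pre-filled with 0 here.

```python
import numpy as np

eachPSM = np.ones([3, 4])
eachPSM[0] = 0

x = eachPSM.sum(1)            # row sums
y = (eachPSM != 0).sum(1)     # nonzero counts per row

# Divide only where y != 0; other positions keep the 0.0 from `out`.
result = np.divide(x, y, out=np.zeros_like(x), where=y != 0)
print(result)
```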

answered Oct 22 '17 at 11:00

Bharath M Shetty


score 0 · Answer 4 · answered Oct 22 '17 at 13:10

Masked arrays provide an elegant solution:

eachPSM = np.ones([3,4])
eachPSM[0] = 0
eachPSM[1,1] = 0

#[[ 0.  0.  0.  0.]
# [ 1.  0.  1.  1.]
# [ 1.  1.  1.  1.]]

In [39]: np.ma.masked_equal(eachPSM,0).mean(1)
Out[39]: 
masked_array(data = [-- 1.0 1.0],
             mask = [ True False False],
       fill_value = 1e+20)

In [40]: np.ma.masked_equal(eachPSM,0).mean(1).data
Out[40]: array([ 0.,  1.,  1.])
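
Reading `.data` returns whatever is stored underneath the mask, which, as far as I know, NumPy does not promise to be meaningful. A more explicit option is `filled`, which lets you choose the value for the all-zero rows. A small sketch with the same array (the names `row_means` and `result` are introduced here for illustration):

```python
import numpy as np

eachPSM = np.ones([3, 4])
eachPSM[0] = 0
eachPSM[1, 1] = 0

row_means = np.ma.masked_equal(eachPSM, 0).mean(1)

# filled() replaces masked entries (the all-zero rows) with the given value.
result = row_means.filled(0)
print(result)
```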
