I have a problem making a histogram when some of my data contains "not a number" (NaN) values. I can get rid of the error by using nan_to_num from numpy, but then I get a lot of zero values, which mess up the histogram as well.

import numpy
import pylab

pylab.figure()
pylab.hist(numpy.nan_to_num(A))  # every NaN in A becomes 0 here
pylab.show()

So the idea would be to make another array in which all the NaN values are gone, or to mask them in the histogram in some way (preferably with a built-in method).

jabaldonedo
usethedeathstar

1 Answer

Remove the np.nan values from your array with A[~np.isnan(A)]. This selects every entry of A that is not NaN, so the NaN values are excluded when the histogram is computed. Here is an example of how to use it:

>>> import numpy as np
>>> import pylab

>>> A = np.array([1,np.nan, 3,5,1,2,5,2,4,1,2,np.nan,2,1,np.nan,2,np.nan,1,2])

>>> pylab.figure()
>>> pylab.hist(A[~np.isnan(A)])
>>> pylab.show()

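To check the filtering without opening a plot window, here is a minimal sketch that reuses the array from the example and counts the bins with `np.histogram`. The names `valid`, `cleaned`, `counts` and `edges`, and the choice of `bins=5`, are illustrative assumptions and not part of the original answer.

```python
import numpy as np

A = np.array([1, np.nan, 3, 5, 1, 2, 5, 2, 4, 1, 2, np.nan, 2, 1, np.nan, 2, np.nan, 1, 2])

# Boolean mask: True where the entry is a number, False where it is NaN.
valid = ~np.isnan(A)
cleaned = A[valid]

# np.histogram returns the bin counts and bin edges without drawing anything.
# bins=5 is an arbitrary choice for this sketch.
counts, edges = np.histogram(cleaned, bins=5)

print(len(A), len(cleaned))  # sizes before and after dropping NaN
print(counts)
```

The `cleaned` array can be passed to `pylab.hist` exactly as in the example above; the original `A` is left unchanged.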

jabaldonedo
  • that works, thanks (i can only accept your answer in 4 min) Not entirely sure where you found that ~ statement in the documentation, but it works – usethedeathstar Sep 30 '13 at 09:02
  • 1
    @usethedeathstar [Here's the `~`](http://docs.scipy.org/doc/numpy/reference/generated/numpy.invert.html), and [here are all of the bitwise operators as implemented in numpy](http://docs.scipy.org/doc/numpy/reference/routines.bitwise.html) – askewchan Sep 30 '13 at 21:05
  • 2
    If you didn't know about the `~` operator, you could just use `A[np.isfinite(A)]` which is possibly more what you want anyway. – askewchan Sep 30 '13 at 21:07
  • Note that outside numpy, `~` is not the same as logical negation (`not`): on plain Python values `~` is a bitwise inversion and can return something you don't expect (for example `-2`, which evaluates to `True`). For plain Python values, spell out `not` – alexey Jan 19 '18 at 00:01
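The two points raised in the comments can be illustrated with a short sketch. The array `B` and the variable names below are hypothetical and chosen only for illustration: `np.isfinite` also drops infinities, which `~np.isnan` keeps, and `~` behaves as element-wise NOT only on NumPy boolean arrays.

```python
import numpy as np

# Hypothetical data containing both NaN and infinities.
B = np.array([1.0, np.nan, np.inf, 2.0, -np.inf])

not_nan = B[~np.isnan(B)]   # drops NaN only; +inf and -inf are kept
finite = B[np.isfinite(B)]  # drops NaN and both infinities

# On a NumPy boolean array, ~ is an element-wise logical NOT.
mask = np.array([True, False])
inverted = ~mask

# On a plain Python int, ~ is bitwise inversion, so the result is still truthy.
bitwise = ~1      # -2
logical = not 1   # False

print(not_nan, finite, inverted, bitwise, logical)
```

For a histogram, `np.isfinite` is often the safer mask, since infinite values would otherwise stretch the bin range.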