
I'm a little confused about the output of numpy.median in the case of masked arrays. Here is a simple example (assuming numpy is imported - I have version 1.6.2):

>>> a = [3.0, 4.0, 5.0, 6.0, numpy.nan]
>>> am = numpy.ma.masked_array(a, [numpy.isnan(x) for x in a])

I'd like to use the masked array to ignore NaN values when calculating the median. This works for the mean, using either numpy.mean or the mean() method of the masked array:

>>> numpy.mean(a)
nan
>>> numpy.mean(am)
4.5
>>> am.mean()
4.5

However for median I get:

>>> numpy.median(am)
5.0

but I'd expect something more like this result:

>>> numpy.median([x for x in a if not numpy.isnan(x)])
4.5

and unfortunately a masked_array does not have a median method.

Paul Joireman

1 Answer


Use np.ma.median on a MaskedArray.

[Explanation: If I remember correctly, np.median does not support subclasses, so it fails to work correctly on np.ma.MaskedArray.]
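As a minimal sketch of this suggestion applied to the array from the question: it assumes numpy is imported as np, and it builds the mask with np.ma.masked_invalid (which masks NaN and inf entries) rather than the list comprehension used in the question. That swap is only for brevity; the question's masked_array construction should work just as well.

```python
import numpy as np

a = [3.0, 4.0, 5.0, 6.0, np.nan]

# Mask the invalid (NaN/inf) entries; the same idea as the mask built
# by hand in the question.
am = np.ma.masked_invalid(a)

# np.ma.median skips masked entries, so only 3, 4, 5 and 6 contribute.
med = np.ma.median(am)
print(med)  # 4.5
```

The result matches the median of the list with the NaN filtered out, which is the value the question expected.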

Pierre GM
  • @PaulJoireman You're welcome. More often than not, the `np.ma` module implements the equivalent of `numpy` functions, adapted to `MaskedArray`. – Pierre GM Sep 11 '12 at 15:27