13

I am looking for a way to round a numpy array in a more intuitive fashion. I have an array of several floats, and would like to limit them to only a few decimal places. This would be done as such:

>>> import numpy as np
>>> np.around([1.21, 5.77, 3.43], decimals=1)
array([1.2, 5.8, 3.4])

Now the problem arises when trying to round numbers that are exactly between the rounding steps. I would like 0.05 rounded to 0.1, but np.around is set to round to the "nearest even number". This produces the following:

>>> np.around([0.55, 0.65, 0.05], decimals=1)
array([0.6, 0.6, 0.0])

My question then amounts to: what is the most effective way to round ties to the nearest number upward, and not simply to the nearest even number?

For more info on np.around, see its documentation.

pirtle
  • python round() instead of numpy.around()? – hifkanotiks Aug 15 '12 at 19:06
  • 5
    0.05 is _exactly_ the same distance from 0.0 and 0.1; neither is the nearest. The reason for the "nearest even number" rule is to reduce the overall error. – MRAB Aug 15 '12 at 19:08
  • 1
    yes, this behavior is the IEEE standard for floats. Also, if you know you'll always be working with floats of a certain precision, python has a `decimal` type – Ryan Haining Aug 15 '12 at 19:09
  • 2
    Why do you need to round them? Just to show some results without unnecessary decimals? – jorgeca Aug 15 '12 at 19:27
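As a minimal sketch of the `decimal` suggestion from the comments above: the standard-library `decimal` module lets you choose the tie-breaking rule explicitly. The helper name `round_half_up_decimal` is hypothetical, and converting through `str` assumes the inputs are ordinary floats whose short printed form is the value you intend. It works element by element, so it is likely much slower than a vectorized numpy call on large arrays.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up_decimal(values, decimals=1):
    """Round each value to `decimals` places, sending exact ties upward in magnitude."""
    # Quantum is the smallest step to keep, e.g. Decimal('0.1') for decimals=1.
    quantum = Decimal(10) ** -decimals
    rounded = []
    for v in values:
        # Go through str() so 0.05 is read as the decimal 0.05,
        # not as its slightly larger binary approximation.
        d = Decimal(str(float(v)))
        rounded.append(float(d.quantize(quantum, rounding=ROUND_HALF_UP)))
    return rounded


print(round_half_up_decimal([0.55, 0.65, 0.05]))  # [0.6, 0.7, 0.1]
```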

1 Answer

8

The way np.around does this is correct, but if you want different behavior, you could, for example, subtract an amount much smaller than the rounding precision:

def myround(a, decimals=1):
    # Shift each value down by a tiny offset so exact ties no longer round to even
    return np.around(a - 10**(-(decimals + 5)), decimals=decimals)

In [22]: myround(np.array([ 1.21,  5.77,  3.43]), 1)
Out[22]: array([ 1.2,  5.8,  3.4])

In [23]: myround(np.array([ 0.55,  0.65,  0.05]), 1)
Out[23]: array([ 0.5,  0.6,  0. ])

I chose 5 here because, by not including the even/odd distinction, you're implicitly introducing an average error of about 10**(-(decimals+1))/2, so you shouldn't complain about an explicit error of 1/10000th of that error.
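Note that myround subtracts the offset, so exact ties go down (0.05 becomes 0.0 in the output above). If ties should go up, as the question asks, two variants could be sketched as follows. Both function names are hypothetical and neither is part of numpy; the first simply flips the sign of the offset, the second uses the common floor(x + 0.5) trick applied to magnitudes. Both assume inputs that are short decimal literals; values already carrying binary round-off near a tie may still land on either side.

```python
import numpy as np


def myround_up(a, decimals=1):
    """Same idea as myround, but add the tiny offset so ties round up."""
    a = np.asarray(a, dtype=float)
    return np.around(a + 10**(-(decimals + 5)), decimals=decimals)


def round_half_away(a, decimals=1):
    """Round ties away from zero using floor(|x| * scale + 0.5)."""
    a = np.asarray(a, dtype=float)
    scale = 10.0**decimals
    return np.sign(a) * np.floor(np.abs(a) * scale + 0.5) / scale


values = np.array([0.55, 0.65, 0.05])
print(myround_up(values))       # ties go up: 0.6, 0.7, 0.1
print(round_half_away(values))  # ties go up: 0.6, 0.7, 0.1
```

The offset version keeps the answer's trade-off (a small explicit error of 10**(-(decimals+5))), while the floor version introduces no offset but reintroduces the upward bias that round-half-to-even is designed to avoid.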

tom10
  • Could you explain a bit more about what you mean by introducing a higher error rate? – Will Jul 09 '13 at 01:04
  • @Will: Could you be more explicit with your question? For example, I don't see where I mention "introducing a higher error rate", and don't know what you mean by that phrase. – tom10 Jul 09 '13 at 17:38
  • @tom10 I meant this "The reason I chose 5 here, was that by not including the even/odd distinction, you're implicitely introducing an average error of about 10**(-(decimal+1))/2 so you shouldn't complain about an explicit error of 1/10000th of that error." – Will Jul 10 '13 at 09:21
  • @Will: For numbers like 1.23456, the OP (originally) didn't like rounding based on the parity of the digit to the right of the 5 (in this case 6, which is even), and he suggested not using this approach. I pointed out that not using parity would introduce an error, and suggested an alternate method, which still introduced an error but where my error would have been 10^5 (or 100,000) times less than the OP's no parity approach. This then, really, just makes it clear that it's better to use the parity approach, which doesn't introduce an explicit error. – tom10 Jul 10 '13 at 15:04
  • Wow, this is crazy! Why would you round numbers in the middle to the next even number and not up by default, as everyone in the real world is doing it? This means that an even progression of 0, 0.1, 0.2... rounded to full numbers will give you 6 zeros, 9 Ones, 11 Twos, 9 Threes ... not nice. – Zak Nov 25 '14 at 20:57
  • ```st = np.arange(0,10.1,0.1); stl = np.floor(st) + ((st-np.floor(st))>0.5)*1; ste = np.round(st,0); stu = np.floor(st) + ((st-np.floor(st))>=0.5)*1; [(np.sum(s) -np.sum(st), np.bincount(s.astype(int))) for s in [stl,ste,stu]]``` gives ```[(-5.0, array([ 6, 10, 10, 10, 10, 10, 10, 10, 10, 10, 5], dtype=int64)), (0.0, array([ 6, 9, 11, 9, 11, 9, 11, 9, 11, 9, 6], dtype=int64)), (5.0, array([ 5, 10, 10, 10, 10, 10, 10, 10, 10, 10, 6], dtype=int64))]``` which is mathematically more correct – Glen Fletcher Mar 28 '17 at 03:40
  • Also see https://en.wikipedia.org/wiki/Rounding#Tie-breaking. Rounding to even is the correct method as defined by the floating-point standard. The more conventional rounding up is meant for everyday usage rather than mathematical usage, where a simpler method is needed because most people using it will not have a strong mathematical background. More advanced mathematics courses (e.g. at university) cover such topics, and since numpy is designed for scientific usage, the concept shouldn't be a problem for its primary users. – Glen Fletcher Mar 28 '17 at 03:58