3

Why can the order of multiplications impact the results? Consider the following code:

a = 47.215419672114173
b = -0.45000000000000007
c = -0.91006620964286644
result1 = a * b * c
temp = b * c
result2 = a * temp
result1 == result2

We all know that result1 should be equal to result2; however, we get:

result1==result2 #FALSE!

The difference is minimal:

result1-result2 #3.552713678800501e-15

However, for particular applications this error can amplify so that the output of two programs doing the same computations (one using result1 and the other result2) can be completely different.

Why is this so and what can be done to address such issues in heavily numerical/scientific applications?

Thanks!

UPDATE

Good answers, but I still don't see the reason why the order of multiplication matters, e.g.

temp2 = a * b
result3 = temp2 * c
result1 == result3 #True

So it seems that the compiler/interpreter sees a*b*c as (a*b)*c.
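One way to confirm this in CPython is to disassemble the expression; the opcode names vary across Python versions, but the bytecode multiplies a*b first and only then multiplies by c:

import dis

# The two multiplications appear in source order: a*b first, then (a*b)*c
dis.dis("a*b*c")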

Mannaggia
  • 4,559
  • 12
  • 34
  • 47
  • 3
    Check out the [Decimal module](http://docs.python.org/library/decimal.html) – Joel Cornett Jul 24 '12 at 20:23
• not 100% sure, but it has something to do with the length of the number, i.e. how many digits? I don't think python can hold so many digits, and thus if you compute `b*c` it might round the last number, which would affect the `a*temp` – hammythepig Jul 24 '12 at 20:24
  • 9
    Comparing machine representations of floating point numbers for *exact* equality makes the Universe unhappy, my friend. – Keith Flower Jul 24 '12 at 20:25
  • 3
    @Mannaggia - here are some links you may find useful in understanding the behavior you are noting: [The Perils of Floating Point](http://www.lahey.com/float.htm), [Floating Point Arithmetic: Issues and Limitations](http://docs.python.org/tutorial/floatingpoint.html) – Keith Flower Jul 24 '12 at 20:45
  • 2
    @Marcin: ok but then what is the role of this blog if one can only ask questions in the field he/she masters? My question was about the _ordering_ of operations and how this affects precision, I know that real numbers cannot be represented with a binary form, and this has **nothing to do** with any IEEE standard (which is just a convention) but the binary (or trinary if you want) coding system. – Mannaggia Jul 24 '12 at 20:47
  • 1
    As others have noted, [order of evaluation of floating point calculations](http://cnx.org/content/m32754/latest/) will be problematic regardless of computer language. – Keith Flower Jul 24 '12 at 20:59
  • 2
    @Marcin His original question was something that is NOT listed in a text book -- if he even knew where to find such information in the first place -- as his question was why a set of computations that he thought where performing the same hardware operation gave slightly different results. The answer requires a knowledge of both operation ordering and hardware inaccuracies. This seems like a perfectly reasonable question to me. – Pyrce Jul 24 '12 at 21:13
  • @Pyrce Really? This is elementary stuff, dealt with at the most introductory stage of computer science education. I would expect any introductory textbook on computing, or on pretty much any related topic, to contain the answer to this question. – Marcin Jul 24 '12 at 21:17
  • 2
    @Marcin They will say that there are inaccuracies in floating point computation. But I remember having the delve quite a bit into EE courses before I realized the ordering of the same computations can change the floating point value slightly. I don't remember any beginning computer science book saying floating numbers a,b,c have the property: a*b*c != c*b*a -- though I am sure you could find one somewhere that talks about this briefly. – Pyrce Jul 24 '12 at 21:20

9 Answers

9

All programming languages lose precision when converting floating point numbers from decimal representation to binary representation. This results in inaccurate calculations (at least from a base 10 perspective, since the math is actually being done on binary floating point values), including cases where the order of operations changes the result. Most languages provide a data structure that maintains base 10 precision, at the cost of performance. Look at Decimal in Python.
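For illustration, a minimal sketch using the values from the question; note that Decimal is finite-precision too, so the context precision is raised here until every intermediate product is exact, which is what makes the grouping irrelevant:

from decimal import Decimal, getcontext

# With 60 significant digits every intermediate product below is exact,
# so no rounding occurs and the grouping cannot change the result
getcontext().prec = 60

a = Decimal("47.215419672114173")    # construct from strings, not floats,
b = Decimal("-0.45000000000000007")  # so the values are not pre-rounded to binary
c = Decimal("-0.91006620964286644")

print((a * b) * c == a * (b * c))  # True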

Edit:

In answer to your update: not exactly. Computers do things in order, so when you give them a sequence of operations they proceed through it one by one. In Python, `*` is left-associative, so `a*b*c` is evaluated as `(a*b)*c`; there is no reordering beyond that sequential processing.

Silas Ray
  • As an example of a floating point error, try `1-0.9`. This turns out to be `0.09999999999999998`, and not `0.1` – MartinHaTh Jul 24 '12 at 20:26
  • 1
    "This results in inaccurate calculations" To be nitpicky, I wouldn't say that the calculations are inaccurate, but rather that they are merely approximations of calculations with real numbers. – Marcin Jul 24 '12 at 20:43
  • @sr2222 Way better, actually. – Marcin Jul 24 '12 at 21:04
6

When you use floating point numbers in any programming language, you will lose precision. You can either:

Accommodate for the loss of precision and adjust your equality checks accordingly, as follows:

 are_equal = abs(result1 - result2) < 0.0001

where 0.0001 is an epsilon value you choose.

Or use the Decimal class provided with Python, which is a bit slower.

Lanaru
6

Each multiplication results in twice as many digits (or bits) as the original numbers and needs to be rounded so that it will fit back into the space allocated for a floating point number. This rounding can potentially change the results when you rearrange the order.
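One way to see this intermediate rounding is to redo the arithmetic exactly; a sketch using the standard library's fractions module, which converts each binary float to an exact rational:

from fractions import Fraction

a = 47.215419672114173
b = -0.45000000000000007
c = -0.91006620964286644

# Fraction(float) is exact, so this product involves no rounding at all
exact = Fraction(a) * Fraction(b) * Fraction(c)

# Each float multiplication rounds its double-width product back to 53 bits,
# so the two groupings accumulate different rounding errors
print((a * b) * c - float(exact))
print(a * (b * c) - float(exact))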

Mark Ransom
  • thanks, this answers conceptually the key question, but also thanks to @sr2222, a pity there can be only one correct answer :-) – Mannaggia Jul 24 '12 at 21:33
  • 1
    @Mannaggia, my answer is true even when the values can be represented exactly in binary. – Mark Ransom Jul 25 '12 at 12:56
• yes, because of the precision lost due to having fewer bits when the result of the computation is put back into memory, right? – Mannaggia Jul 25 '12 at 14:09
  • 1
    @Mannaggia, it has nothing to do with memory, it's the number of bits available in the processor itself. A floating point value on an x86 for example can hold 53 bits in the significand, but the multiplication produces 106 bits. The internal register can hold 64 bits of significand but that still isn't enough, and the result must be rounded even before it's written to memory. – Mark Ransom Jul 25 '12 at 14:22
3

Float comparisons should always be done (by you) with a small epsilon, e.g. 10^-10.
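Since Python 3.5 the standard library does this for you: math.isclose uses a relative tolerance by default, which behaves better than a fixed epsilon when the compared values are very large or very small:

import math

result1 = (47.215419672114173 * -0.45000000000000007) * -0.91006620964286644
result2 = 47.215419672114173 * (-0.45000000000000007 * -0.91006620964286644)

print(result1 == result2)              # False
print(math.isclose(result1, result2))  # True (default rel_tol=1e-09)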

Verena Haunschmid
1

> We all know that result1 should be equal to result2, however we get:

No, we don't all know that. In fact, they should not be equal, which is why they aren't equal.

You seem to believe that you are working with real numbers. You aren't - you are working with IEEE floating point representations. They don't follow the same axioms. They aren't the same thing.

The order of operations matters because Python evaluates each intermediate expression to a floating point number, rounding as it goes.
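A minimal illustration that these axioms really differ -- floating point addition is not associative either:

# Each intermediate sum is rounded to the nearest representable double,
# so the grouping determines which rounding errors occur
print((0.1 + 0.2) + 0.3)  # 0.6000000000000001
print(0.1 + (0.2 + 0.3))  # 0.6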

Marcin
  • 1
    of course, I meant to say that symbolically the two expressions are equal – Mannaggia Jul 24 '12 at 20:27
  • 1
    @Mannaggia No, symbolically the two expressions aren't equal. Which is why they aren't equal. Once again: you are not working with real numbers. – Marcin Jul 24 '12 at 20:29
• well then why is this true: temp2=a*b result3=temp2*c result1==result3 – Mannaggia Jul 24 '12 at 20:33
  • 1
    @Mannaggia It's true because those calculations have the same result. Just because one calculation works the way you believe it should, it does not mean that all calculations should work out the way you think it should. Once again, these are not real numbers, operations on them do not follow the same laws. – Marcin Jul 24 '12 at 20:38
  • 1
    Just because some floating point operations are commutative doesn't mean all are. When working with floating point numbers, commutative behavior is a goal, not a guarantee. – Silas Ray Jul 24 '12 at 20:39
1

Why: probably your machine/Python cannot handle that amount of accuracy. See: http://en.wikipedia.org/wiki/Machine_epsilon#Approximation_using_Python

What to do: This should help: http://packages.python.org/bigfloat/
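As a rough check (assuming IEEE 754 double precision, which CPython floats use on virtually all platforms), the difference reported in the question is exactly one unit in the last place of the result:

import sys

# Machine epsilon: the spacing of doubles just above 1.0
print(sys.float_info.epsilon)  # 2.220446049250313e-16

# result1 is about 19.34, which lies in [16, 32); doubles there are spaced
# 2**-48 apart -- exactly the difference observed in the question
print(2.0 ** -48)  # 3.552713678800501e-15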

Levon
Justin Harris
1

Representing numbers in computers is a big research area in computer science. This problem is not specific to Python; every programming language has this property, since by default it would be far too expensive to make every calculation arbitrarily accurate.

The numerical stability of an algorithm reflects some of these limitations when designing numerical algorithms. As said before, Decimal is defined as a standard for performing precise calculations, e.g. in banking applications or any other application that might need it, and Python ships an implementation of this standard.
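A small sketch of the use case that standard targets: exact base-10 arithmetic, e.g. for money, where binary floats cannot even represent the inputs exactly:

from decimal import Decimal

print(0.1 + 0.2)                        # 0.30000000000000004
print(Decimal("0.1") + Decimal("0.2"))  # 0.3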

gutes
0

As answered well in previous posts, this is a floating point arithmetic issue common to programming languages. You should never test float types for exact equality.

When you have such comparisons, you can employ a function that compares based on a given tolerance (threshold). If the numbers are close enough, they should be considered equal number-wise. Something like:

def isequal_float(x1, x2, tol=1e-8):
    """Return whether two floats are equal, up to a given tolerance."""
    return abs(x1 - x2) < tol

will do the trick. If I'm not mistaken, the exact tolerance depends on whether the float type is single- or double-precision and this depends on the language you're using.

Using such a function allows you to easily compare the results of calculations, for instance in numpy. Take the following example, where the correlation matrix of a dataset with continuous variables is calculated in two ways: the pandas method pd.DataFrame.corr() and the numpy function np.corrcoef():

import numpy as np
import seaborn as sns 

iris = sns.load_dataset('iris')
iris.drop('species', axis = 1, inplace=True)

# calculate correlation coefficient matrices using two different methods
cor1 = iris.corr().to_numpy()
cor2 = np.corrcoef(iris.transpose())

print(cor1)
print(cor2)

The printed results look identical:

[[ 1.         -0.11756978  0.87175378  0.81794113]
 [-0.11756978  1.         -0.4284401  -0.36612593]
 [ 0.87175378 -0.4284401   1.          0.96286543]
 [ 0.81794113 -0.36612593  0.96286543  1.        ]]
[[ 1.         -0.11756978  0.87175378  0.81794113]
 [-0.11756978  1.         -0.4284401  -0.36612593]
 [ 0.87175378 -0.4284401   1.          0.96286543]
 [ 0.81794113 -0.36612593  0.96286543  1.        ]]

but exact equality tells a different story. These comparisons:

print(cor1 == cor2)
print(np.equal(cor1, cor2))

will yield mostly False results element-wise:

[[ True False False False]
 [False False False False]
 [False False False False]
 [False False False  True]]

Likewise, np.array_equal(cor1, cor2) will also yield False. However, the custom-made function gives the comparison you want:

out = [isequal_float(i,j) for i,j in zip(cor1.reshape(16, ), cor2.reshape(16, ))]
print(out)

[True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True]

Note: numpy includes the np.allclose() function, which checks whether all elements of two arrays are equal within a tolerance (np.isclose() gives the element-wise result).

print(np.allclose(cor1, cor2))
# True
dbouz
0

Some great answers here about how to deal with floating point arithmetic. But you seem to be asking more specifically why a*b*c != a*(b*c) [result1 != result2]. The answer is simple: floating point arithmetic is not guaranteed to be associative.

When you assigned temp = b*c, the computer already made an imprecise calculation (because it had to round the product), and that error propagated into result2 = a*temp. On the other hand, when you calculated result1 = a*b*c, the rounding started with the intermediate result a*b and propagated through the final multiplication by c. The two groupings round at different points, so their errors differ. That's also why, for example, adding the number 0.11 ten thousand times gives you much more imprecision than computing 0.11 * 10000: the error propagates over many operations.
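A quick sketch of that last point; the exact digits depend on the platform's doubles, but the repeated sum accumulates one rounding error per iteration while the multiplication rounds only once:

total = 0.0
for _ in range(10000):
    total += 0.11   # one rounding error per iteration, 10,000 of them

print(total)         # close to, but in general not exactly, 1100.0
print(0.11 * 10000)  # a single rounded operation: 1100.0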

If you want some more in-depth knowledge about this topic, you can read the Python docs on floating point, the article What Every Computer Scientist Should Know About Floating-Point Arithmetic, or any introduction to numerical analysis/methods from the many available courses and books.