3

To compare two variables of type double in C, I have defined #define EQUALITY_EPSILON 1e-8. I am doing the comparison as follows:

if((img_score[i] - img_score[j]) >= EQUALITY_EPSILON){
    // handle for ith score greater than jth score
}
else if((img_score[j] - img_score[i]) >= EQUALITY_EPSILON){
    // handle for ith score smaller than jth score
}
else{
    // handle for ith score equal to jth score
}

The problem I am facing is that the scores in my code are extremely small, so with EQUALITY_EPSILON = 1e-8 the comparison reports equality in some cases where the scores actually differ. My question is: how small can I set EQUALITY_EPSILON?

Eric Postpischil
stressed_geek
  • Have you thought about using a relative epsilon? http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/ – cnicutar Aug 20 '12 at 12:45
  • Must read: [epsilon is not 0.0001](http://realtimecollisiondetection.net/pubs/GDC05_Ericson_Numerical_Robustness_for_Geometric_Calculations.ppt) – Damon Aug 20 '12 at 13:48
  • The errors in floating-point values depend on previous operations. Determining an acceptable error threshold requires analyzing both the previous operations (how large might the error be) and the purposes of the comparison (how large an error can be tolerated). It is impossible to give a proper answer to this question without additional information. – Eric Postpischil Aug 20 '12 at 14:21

3 Answers

14

Floating point numbers aren't spread out evenly on the number line. They're very dense around 0, and as the magnitude increases, the 'delta' between two expressible values increases:

                               0
|      |     |    |   |  |  | ||| |  |  |   |    |     |      |

This means, for small numbers, you need to use a smaller 'epsilon'.

(EDIT: The error, or epsilon, you allow for should be in the same range as the error you already expect in the values you are comparing, i.e. the errors due to the floating-point operations that produced them. See the comments below for why. /EDIT)

One fair indication of the kind of 'epsilon' you need to use can be had from nextafter in math.h:

nextafter(x, y) returns the next representable value after x, when travelling along the number line, in the direction of y.

Another way to do it would be to calculate the difference between img_score[i] and img_score[j], and see how small it is compared to the magnitude of img_score[i] or img_score[j]. How much smaller? You need to decide.
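
One possible shape for such a relative comparison is sketched below. The function name compare_relative and the parameter rel_tol are assumptions made for this example, and scaling by the larger of the two magnitudes is just one choice among several. Picking rel_tol is still up to you and depends on the error your earlier computations may have introduced.

```c
#include <assert.h>
#include <math.h>

/* Three-way comparison with a relative tolerance.
   Returns 1 if a > b, -1 if a < b, and 0 if they are treated as equal.
   compare_relative and rel_tol are hypothetical names for this sketch. */
int compare_relative(double a, double b, double rel_tol)
{
    assert(rel_tol >= 0.0);

    double diff  = a - b;
    double scale = fabs(a) > fabs(b) ? fabs(a) : fabs(b);  /* larger magnitude */

    if (fabs(diff) <= rel_tol * scale)
        return 0;               /* difference is negligible relative to the values */
    return diff > 0.0 ? 1 : -1;
}
```

The three branches in the question would then switch on compare_relative(img_score[i], img_score[j], rel_tol) instead of testing against a fixed EQUALITY_EPSILON.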

ArjunShankar
  • Thanks Arjun! If I don't need the equality check, then just a direct comparison like `if(img_score[i] > img_score[j])` should also work right? – stressed_geek Aug 20 '12 at 13:11
  • @stressed_geek Yes it will. All comparisons work 'correctly' of course. So if one value is greater (even by `nextafter` worth of difference), it will show up with `>`. – ArjunShankar Aug 20 '12 at 13:17
  • The magnitude of the numbers being compared, and the spacing of the floating-point numbers in that area, is irrelevant to the magnitude of the error produced by previous operations. This question presents no information by which the error may be bounded, and therefore no reliable advice can be given about how to compensate for the error. – Eric Postpischil Aug 20 '12 at 13:28
  • @stressed_geek - The above comment makes me think you should reconsider your question. In essence, what Eric Postpischil says is right. You must consider the kind of errors produced by your FP arithmetic. – ArjunShankar Aug 20 '12 at 13:47
2

You should not be using an absolute value for the epsilon. The value chosen should be relative to the values you're comparing.

For example, you can divide the smaller of the two values (the one closest to zero) by one million and use the absolute value of the result as the epsilon.

Whether that's a suitable choice depends on what operations were performed to reach the values. It may not be suitable for all cases.
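
A minimal sketch of that example follows. The name nearly_equal_ppm is made up for illustration, and the factor of one million is only the figure from the example above, not a recommended constant.

```c
#include <assert.h>
#include <math.h>

/* Returns nonzero when a and b differ by at most one millionth of the
   value closest to zero. nearly_equal_ppm is a hypothetical name. */
int nearly_equal_ppm(double a, double b)
{
    assert(!isnan(a) && !isnan(b));   /* NaN inputs are a caller error here */

    double smallest = fabs(a) < fabs(b) ? fabs(a) : fabs(b);
    double epsilon  = smallest / 1e6;  /* relative to the values, not absolute */

    return fabs(a - b) <= epsilon;
}
```

Note that when either value is exactly zero the epsilon is zero too, so only exact equality passes.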

paxdiablo
  • Thanks. Could you please clarify what are you referring to when you say "smallest"? – stressed_geek Aug 20 '12 at 12:49
  • @stressed_geek: _very_ good point. I meant smallest absolute value: closest to zero, not closest to negative infinity. – paxdiablo Aug 20 '12 at 12:53
  • Got it, thanks again. Moreover, if I don't need the equality check, then just a direct comparison like `if(img_score[i] > img_score[j])` should also work right? – stressed_geek Aug 20 '12 at 13:08
  • If you're going to do this kind of silly epsilon comparison, the value of epsilon should be something like `(fabs(a)+fabs(b))*DBL_EPSILON*K` where `K` is some small constant corresponding roughly to the number of rounding steps that occurred in computing `a` and `b`. – R.. GitHub STOP HELPING ICE Aug 20 '12 at 13:16
  • The magnitude of the numbers being compared, and the spacing of the floating-point numbers in that area, is irrelevant to the magnitude of the error produced by previous operations. This question presents no information by which the error may be bounded, and therefore no reliable advice can be given about how to compensate for the error. – Eric Postpischil Aug 20 '12 at 13:29
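
The formula `(fabs(a)+fabs(b))*DBL_EPSILON*K` from the comments above could be wrapped up roughly as follows. The function name within_rounding_steps is hypothetical, and the right value of k depends entirely on how the two values were computed, which the question does not say.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Returns nonzero when |a - b| is within the tolerance
   (|a| + |b|) * DBL_EPSILON * k, where k roughly counts the rounding
   steps behind a and b. within_rounding_steps is a hypothetical name. */
int within_rounding_steps(double a, double b, double k)
{
    assert(k >= 0.0);

    double tolerance = (fabs(a) + fabs(b)) * DBL_EPSILON * k;
    return fabs(a - b) <= tolerance;
}
```

With a small k such as 4.0, the sum 0.1 + 0.2 compares as equal to 0.3, while 1e-10 and 2e-10 remain distinct.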
1

The smallest positive normal value for a double is 2^{-1022}, which is around 2.2 × 10^{-308} (subnormal values go lower still, at reduced precision). But there are only 53 bits of precision in the mantissa (the part where your significant digits are). If you know in advance the approximate value of the difference, then you can go almost as low as 2^{-53} times that difference for your epsilon. If you don't know in advance, you should probably build your epsilon as a percentage of one of the differences.

Note that going down to 2^{-53} times the difference is the limit caused by the limited precision in the double. If the two values you are subtracting are so close to each other that you have fewer than 53 significant bits in their difference, your epsilon could only go as low as 2^{-x} times the difference, where x is the number of significant bits in the difference.
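
The limits mentioned above are available as macros in float.h, so they need not be hard-coded. The sketch below assumes IEEE 754 binary64 doubles, where DBL_EPSILON is 2^{-52} and DBL_MANT_DIG is 53; epsilon_floor is a hypothetical helper name for the 2^{-53} scaling described in this answer.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Lower limit for an epsilon at a given magnitude: 2^-53 times that
   magnitude, written as DBL_EPSILON / 2 (DBL_EPSILON is 2^-52 for
   IEEE 754 doubles). epsilon_floor is a hypothetical name. */
double epsilon_floor(double magnitude)
{
    assert(isfinite(magnitude));
    return fabs(magnitude) * (DBL_EPSILON / 2.0);
}
```

An epsilon below this floor cannot distinguish anything that a plain == would not, so it marks how small an epsilon can usefully be, not how large it needs to be to absorb accumulated error.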

David