To me floating point errors are essentially those which on an x86 would lead to a floating point exception (assuming the coprocessor has that interrupt enabled). A special case is the "inexact" exception i e when the result was not exactly representable in the floating point format (such as when dividing 1 by 3). Newbies not yet at home in the floating-point world will expect exact results and will consider this case an error.
As I see it there are several strategies available.
- Early data checking such that bad values are identified and handled
when they enter the software. This lessens the need for testing
during the floating operations themselves which should improve
performance.
- Late data checking such that bad values are identified
immediately before they are used in actual floating point operations.
Should lead to lower performance.
- Debugging with floating point
exception interrupts enabled. This is probably the fastest way to
gain a deeper understanding of floating point issues during the
development process.
to name just a few.
When I wrote a proprietary database engine over twenty years ago using an 80286 with an 80287 coprocessor I chose a form of late data checking and using x87 primitive operations. Since floating point operations were relatively slow I wanted to avoid doing floating point comparisons every time I loaded a value (some of which would cause exceptions). To achieve this my floating point (double precision) values were unions with unsigned integers such that I would test the floating point values using x86 operations before the x87 operations would be called upon. This was cumbersome but the integer operations were fast and when the floating point operations came into action the floating point value in question would be ready in the cache.
A typical C sequence (floating point division of two matrices) looked something like this:
// calculate source and destination pointers
type1=npx_load(src1pointer);
if (type1!=UNKNOWN) /* x87 stack contains negative, zero or positive value */
{
type2=npx_load(src2pointer);
if (!(type2==POSITIVE_NOT_0 || type2==NEGATIVE))
{
if (type2==ZERO) npx_pop();
npx_pop(); /* remove src1 value from stack since there won't be a division */
type1=UNKNOWN;
}
else npx_divide();
}
if (type1==UNKNOWN) npx_load_0(); /* x86 stack is empty so load zero */
npx_store(dstpointer); /* store either zero (from prev statement) or quotient as result */
npx_load would load value onto the top of the x87 stack providing it was valid. Otherwise the top of the stack would be empty. npx_pop simply removes the value currently at the top of the x87. BTW "npx" is an abbreviation for "Numeric Processor eXtenstion" as it was sometimes called.
The method chosen was my way of handling floating-point issues stemming from my own frustrating experiences at trying to get the coprocessor solution to behave in a predictable manner in an application.
For sure this solution led to overhead but a pure
*dstpointer = *src1pointer / *src2pointer;
was out of the question since it didn't contain any error handling. The extra cost of this error handling was more than made up for by how the pointers to the values were prepared. Also, the 99% case (both values valid) is quite fast so if the extra handling for the other cases is slower, so what?