
Firstly, the problem I'm trying to solve is coming up with a better representation for values that will always remain uniformly distributed in the range:

0.0 <= x < 1.0

The motivation for this is to reduce the number of bytes used to store this data (the application is heavily memory- and I/O-bandwidth-bound). Currently a 32-bit floating-point representation is used; 16-bit floating point has proven insufficiently accurate.

My initial thought is to store the data in a 16-bit integer and simply use the scheme:

x/(2^16 - 1) [x is an unsigned short]

To keep the algorithms largely the same and to retain use of the same floating-point hardware operations (at least at first), I would ideally like to keep converting this fractional representation into floating-point representation, performing the operation(s), then converting back into fractional representation for storage.
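
For concreteness, here's a minimal sketch of the round trip I have in mind (helper names are purely illustrative):

#include <cstdint>

const float kScale = 1.0f / 65535.0f;   // x / (2^16 - 1)

float to_float(uint16_t x)              // fractional -> float
{
    return x * kScale;
}

uint16_t to_fraction(float f)           // float -> fractional (rounds)
{
    return (uint16_t)(f * 65535.0f + 0.5f);
}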

Clearly, there will be a loss of precision going back and forth between these two quite different, imprecise representations, but for our application, I suspect this might be an acceptable tradeoff.

I've done some research looking at what is currently out there that might give us a good starting point. The seminal "What Every Computer Scientist Should Know About Floating-Point Arithmetic" article (http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) led me to look at a few others, "Beyond Floating Point" (home.ccil.org/~cowan/temp/p319-clenshaw.pdf) being one such example.

Can anyone point me to other examples of representations that people have used elsewhere that might satisfy these requirements?

I'm concerned that any potential gain in exactness of representation (we're currently wasting much of the floating-point format by restricting it to this specific range) will be completely outweighed by the requirement to round twice, going from fractional representation to floating-point and back again. In that case, it may be necessary to do arithmetic directly in this fractional representation to get any benefit out of the approach. Any advice on this point would be appreciated.

Dan
  • You've read about [fixed-point arithmetics](http://en.wikipedia.org/wiki/Fixed-point_arithmetic)? – Some programmer dude Nov 10 '13 at 13:31
  • There was an ACCU Overload series about this: http://accu.org/index.php/journals/1717 – doctorlove Nov 10 '13 at 13:32
  • A representation of `1/x` will fail to meet your requirement of uniform distribution. – IInspectable Nov 10 '13 at 13:33
  • sounds like performance will also be an issue, converting back and forth. – AndersK Nov 10 '13 at 13:53
  • Apologies, wrote down completely the wrong representation - should have been x/(2^16 - 1). That should have uniform distribution in the correct range. I've updated the post. – Dan Nov 10 '13 at 14:11
  • It's my understanding that fixed-point is for a fixed number of digits after the radix point; I don't think this is quite the same as this fractional representation (though I guess it could be, depending on the denominator)? – Dan Nov 10 '13 at 14:12

3 Answers


Don't use 2^16 - 1. Use 2^16. Yes, you will have very slightly less precision and will waste your 0xFFFF, but you will guarantee that there is no loss of precision when converting to floating point. (In contrast, when converting away from floating point, you will lose 8 bits of mantissa precision.)
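
For example, a minimal sketch of the two conversions (assuming a 32-bit float and a 16-bit unsigned value; names are illustrative):

#include <cstdint>

float fixed_to_float(uint16_t x)        // exact: every multiple of 2^-16 in
{                                       // [0, 1) fits in a 24-bit significand
    return x * (1.0f / 65536.0f);
}

uint16_t float_to_fixed(float f)        // rounds; clamp so values just below
{                                       // 1.0 don't overflow 16 bits
    float scaled = f * 65536.0f + 0.5f;
    return (uint16_t)(scaled > 65535.0f ? 65535.0f : scaled);
}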

Round-trip conversions between precisions can cause problems with certain operations, in particular progressively summing numbers. If at all possible, treat your fixed-point values as "dirty", and don't use them for further floating-point computations; prefer recalculating from inputs to using intermediate results which are in fixed-point form.

Alternatively, use 24 bits. With this representation, you will lose no precision in either direction as long as your values don't underflow (that is, as long as they're above 2^-24).
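
A rough sketch of how the three-byte storage might look (byte order and helper names are my own assumptions):

#include <cstdint>

void store24(uint8_t *dst, uint32_t v)  // store value / 2^24 in three bytes
{                                       // (little-endian here, by assumption)
    dst[0] = (uint8_t)(v);
    dst[1] = (uint8_t)(v >> 8);
    dst[2] = (uint8_t)(v >> 16);
}

uint32_t load24(const uint8_t *src)
{
    return (uint32_t)src[0] | ((uint32_t)src[1] << 8) | ((uint32_t)src[2] << 16);
}

float fixed24_to_float(uint32_t v)      // exact: a 24-bit integer fits in a
{                                       // float's significand, and scaling by
    return v * (1.0f / 16777216.0f);    // 1 / 2^24 is a power of two
}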

Sneftel

Wouldn't 1/x be badly distributed in your range? 1/2, 1/3, 1/4, ... Do you not want to represent numbers above 1/2?

This kind of thing is done in NetCDF quite a lot to encode data and save space.

const double scale = 1.0 / 65536;   // 2^-16
unsigned short x;                   // stored value

Any number in x is really x*scale

See the example in the NetCDF documentation for a more general approach using scale and offset: http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/tutorial/NetcdfDataset.html
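
A rough sketch of the scale-and-offset idea (names are mine, not the NetCDF API):

#include <cstdint>

struct Packing {
    double scale;       // e.g. (max - min) / 65535
    double offset;      // e.g. min
};

uint16_t pack(double value, const Packing &p)   // real value -> 16-bit code
{
    return (uint16_t)((value - p.offset) / p.scale + 0.5);
}

double unpack(uint16_t packed, const Packing &p)    // 16-bit code -> real value
{
    return packed * p.scale + p.offset;
}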

Dov
  • Yes apologies, wrote down the wrong representation. This does sound exactly like what I'm trying to achieve though. Thanks for the link. – Dan Nov 10 '13 at 14:13
  • to convert to unsigned short: `(unsigned short)(f * 65536)`; to convert from: `f = x * scale;` – Dov Nov 12 '13 at 11:21

Have a look at the "Packed Data Values" section of this page:

https://www.unidata.ucar.edu/software/netcdf/docs/BestPractices.html#Packed%20Data%20Values

John Caron