I am working on a side project at work where I would like to read and write SAS Transport files. The challenge is that numbers are stored as 64-bit IBM floating-point values. While I have been able to find plenty of great resources for reading a byte array (containing an IBM float) into IEEE 32-bit and 64-bit floats, I'm struggling to find code that converts floats/doubles back to IBM floats.
I recently found some code for writing a 32-bit IEEE float back out to a byte array (containing an IBM float). It seems to be working, so I've been trying to translate it to a 64-bit version. I've reverse engineered where most of the magic numbers come from, but I've been stumped for over a week now.
I have also tried to translate the functions listed at the end of the SAS Transport documentation to Java, but I've run into a lot of issues related to endianness, Java's lack of unsigned types, and so on. Can anyone provide the code to convert doubles to IBM floating point format?
Just to show the progress I've made, here are some shortened versions of the code I've written so far:
This grabs a 32-bit IBM float from a byte array and generates an IEEE float:
    public static double fromIBMFloat(byte[] data, int offset) {
        int temp = readIntFromBuffer(data, offset);
        int mantissa = temp & 0x00FFFFFF;
        int exponent = ((temp >> 24) & 0x7F) - 64;
        boolean isNegative = (temp & 0x80000000) != 0;
        double result = mantissa * Math.pow(2, 4 * exponent - 24);
        if (isNegative) {
            result = -result;
        }
        return result;
    }
This is the same thing for 64-bit:
    public static double fromIBMDouble(byte[] data, int offset) {
        long temp = readLongFromBuffer(data, offset);
        long mantissa = temp & 0x00FFFFFFFFFFFFFFL;
        long exponent = ((temp >> 56) & 0x7F) - 64;
        boolean isNegative = (temp & 0x8000000000000000L) != 0;
        // The 64-bit fraction is 56 bits wide, so scale by 2^-56
        // (the 32-bit version scales by 2^-24 for its 24-bit fraction).
        double result = mantissa * Math.pow(2, 4 * exponent - 56);
        if (isNegative) {
            result = -result;
        }
        return result;
    }
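The buffer helpers called above (readIntFromBuffer, readLongFromBuffer, writeIntToBuffer) are not shown in this post. As a minimal sketch of what they might look like, assuming the values are stored big-endian, something along these lines would do. The class name XportBufferSketch and the method bodies are hypothetical stand-ins, not the actual helpers:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-ins for the buffer helpers used in this post.
// Assumption: values are stored big-endian (most significant byte first).
final class XportBufferSketch {

    // Reads 4 bytes starting at offset as a big-endian int.
    static int readIntFromBuffer(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    // Reads 8 bytes starting at offset as a big-endian long.
    static long readLongFromBuffer(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 8).order(ByteOrder.BIG_ENDIAN).getLong();
    }

    // Writes value as 4 big-endian bytes starting at offset.
    static void writeIntToBuffer(byte[] data, int offset, int value) {
        ByteBuffer.wrap(data, offset, 4).order(ByteOrder.BIG_ENDIAN).putInt(value);
    }
}
```

With these, the bytes 0x41 0x10 0x00 0x00 read back as the int 0x41100000, which fromIBMFloat decodes as 1.0 (fraction 1/16 times 16^1).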
Great! These work for going to IEEE floats, but now I need to go the other way. This simple implementation seems to be working for 32-bit floats:
    public static void toIBMFloat(double value, byte[] xport, int offset) {
        if (value == 0.0 || Double.isNaN(value) || Double.isInfinite(value)) {
            writeIntToBuffer(xport, offset, 0);
            return;
        }
        int fconv = Float.floatToIntBits((float)value);
        int fmant = (fconv & 0x007FFFFF) | 0x00800000;
        int temp = (fconv & 0x7F800000) >> 23;
        int t = (temp & 0xFF) - 126;
        while ((t & 0x3) != 0) {
            ++t;
            fmant >>= 1;
        }
        fconv = (fconv & 0x80000000) | (((t >> 2) + 64) << 24) | fmant;
        writeIntToBuffer(xport, offset, fconv);
    }
Now, the only thing left is to translate that to work with 64-bit IBM floats. A lot of the magic numbers relate to the number of bits in the IEEE 32-bit floating point exponent (8 bits) and mantissa (23 bits). So for 64-bit, I just need to switch those to use the 11-bit exponent and 52-bit mantissa. But where does that 126 come from? What is the point of the 0x3 in the while loop?
Any help breaking down the 32-bit version so I can implement a 64-bit version would be greatly appreciated.
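For discussion, here is a minimal sketch of what a 64-bit translation might look like. It rests on one reading of the magic numbers, stated as assumptions in the comments: 126 is the IEEE bias (127) minus one, because IEEE stores the mantissa as 1.f while IBM stores it as 0.f; and the 0x3 loop pads the binary exponent up to a multiple of 4, because the IBM exponent counts powers of 16. The names IbmDoubleSketch and toIbmDoubleBits are hypothetical, and the handling of values outside the IBM range (flush to zero on underflow, exception on overflow) is an arbitrary choice, not something taken from the SAS documentation.

```java
// Sketch only: converts an IEEE 754 double to the raw bits of a 64-bit
// IBM hexadecimal float (1 sign bit, 7-bit base-16 exponent with bias 64,
// 56-bit fraction). Names and out-of-range policy are assumptions.
final class IbmDoubleSketch {

    static long toIbmDoubleBits(double value) {
        // Same convention as the 32-bit version: zero, NaN and infinity become 0.
        if (value == 0.0 || Double.isNaN(value) || Double.isInfinite(value)) {
            return 0L;
        }
        long bits = Double.doubleToLongBits(value);
        long sign = bits & 0x8000000000000000L;
        int ieeeExp = (int) ((bits >>> 52) & 0x7FF);
        if (ieeeExp == 0) {
            return 0L; // subnormal doubles are far below the smallest IBM value
        }
        // Restore the hidden leading 1 (53 significant bits), then shift left
        // by 3 so the value occupies the full 56-bit IBM fraction.
        long mant = ((bits & 0x000FFFFFFFFFFFFFL) | 0x0010000000000000L) << 3;
        // IEEE: 1.f * 2^(e - 1023)  ==  0.1f * 2^(e - 1022).
        // 1022 plays the role that 126 plays in the 32-bit code.
        int t = ieeeExp - 1022;
        // The IBM exponent is a power of 16, so the binary exponent must be a
        // multiple of 4. Each step raises the exponent by one and halves the
        // fraction; at most 3 steps, which the 3 spare bits above absorb.
        while ((t & 0x3) != 0) {
            ++t;
            mant >>>= 1;
        }
        int ibmExp = (t >> 2) + 64; // base-16 exponent plus bias
        if (ibmExp < 0) {
            return 0L; // assumption: underflow flushes to zero
        }
        if (ibmExp > 127) {
            throw new ArithmeticException("value too large for IBM double: " + value);
        }
        return sign | ((long) ibmExp << 56) | mant;
    }
}
```

Under this reading, 1.0 comes out as 0x4110000000000000 (fraction 1/16, exponent 16^1) and 0.5 as 0x4080000000000000. The resulting long would still need to be written to the byte array most significant byte first.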