12

I have a project where a function receives four 8-bit characters and needs to convert the resulting 32-bit IEEE-754 float to a regular Perl number. It seems like there should be a faster way than the working code below, but I have not been able to figure out a simpler pack function that works.

This does not work, but it seems close:

$float = unpack("f", pack("C4", @array[0..3]));  # Fails for small numbers

Works:

@bits0 = split('', unpack("B8", pack("C", shift)));
@bits1 = split('', unpack("B8", pack("C", shift)));
@bits2 = split('', unpack("B8", pack("C", shift)));
@bits3 = split('', unpack("B8", pack("C", shift)));
push @bits, @bits3, @bits2, @bits1, @bits0;

$mantbit = shift(@bits);
$mantsign = $mantbit ? -1 : 1;
$exp = ord(pack("B8", join("",@bits[0..7])));
splice(@bits, 0, 8);

# Convert fractional float to decimal
for (my $i = 0; $i < 23; $i++) {
    $f = $bits[$i] * 2 ** (-1 * ($i + 1));
    $mant += $f;
}
$float = $mantsign * (1 + $mant) * (2 ** ($exp - 127));

Anyone have a better way?

Peter Mortensen
Dan Littlejohn
  • I'm intrigued that your top snippet "doesn't work but is close" -- can you pinpoint the differences? E.g. by taking the result of unpack() and converting it back to the 4 bytes, then looking for bits that are different between input and final output? – j_random_hacker Apr 20 '09 at 22:48

2 Answers

15

I'd take the opposite approach: forget unpacking, stick to bit twiddling.

First, assemble your 32-bit word. Depending on the byte order of your input, the bytes may need to be taken in the reverse order:

my $word = ($byte0 << 24) + ($byte1 << 16) + ($byte2 << 8) + $byte3;

Now extract the parts of the word: the sign bit, exponent and mantissa:

my $sign = ($word & 0x80000000) ? -1 : 1;
my $expo = (($word & 0x7F800000) >> 23) - 127;
my $mant = ($word & 0x007FFFFF | 0x00800000);

Assemble your float:

my $num = $sign * (2 ** $expo) * ( $mant / (1 << 23));

There are some examples on Wikipedia.

  • Tested this on 0xC2ED4000 => -118.625 and it works.
  • Tested this on 0x3E200000 => 0.15625 and found a bug! (fixed)
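  • Don't forget to handle infinities and NaNs, which occur when the raw exponent field is 255 (that is, when $expo == 128 after the offset of 127 has been subtracted)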
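The steps above, together with the special cases, can be sketched as one function. This is only an illustration, not part of the original answer: the name decode_float32 is hypothetical, the bytes are assumed to arrive most significant first, and Perl's own numbers are assumed to be IEEE-754 doubles (so that 9**9**9 overflows to infinity). It also covers subnormal numbers (raw exponent field 0), which have no implicit leading 1 bit and use a fixed exponent of -126.

```perl
use strict;
use warnings;

# Hypothetical helper: decode four bytes (most significant first)
# as an IEEE-754 single-precision value.
sub decode_float32 {
    my @bytes = @_;
    my $word  = unpack 'N', pack 'C4', @bytes;   # 32-bit big-endian word

    my $sign = ($word & 0x80000000) ? -1 : 1;
    my $raw  = ($word & 0x7F800000) >> 23;       # exponent field, 0..255
    my $frac = $word & 0x007FFFFF;               # 23 stored fraction bits

    if ($raw == 255) {
        # Assumes Perl numbers are IEEE doubles, so this overflows to Inf.
        my $inf = 9**9**9;
        return $frac ? $inf - $inf : $sign * $inf;   # NaN or +/- infinity
    }
    if ($raw == 0) {
        # Zero and subnormals: no implicit 1, exponent fixed at -126.
        return $sign * $frac * 2 ** (-149);
    }
    # Normal numbers: implicit leading 1, exponent biased by 127.
    return $sign * (1 + $frac / 2**23) * 2 ** ($raw - 127);
}

print decode_float32(0xC2, 0xED, 0x40, 0x00), "\n";   # -118.625
print decode_float32(0x3E, 0x20, 0x00, 0x00), "\n";   # 0.15625
```

For little-endian input, reverse the byte list before calling the function.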
Peter Mortensen
NickZoic
  • Very nice. This works and is twice as fast for my test case and the answers are correct (tested about 100k unique numbers). In the future I will have to try and get my hands dirty and try doing some bit operations. Thanks! – Dan Littlejohn Apr 20 '09 at 23:18
  • No worries, enjoy. Should be plenty quick. PS: "when $expo == 255" should read "when $expo == 128" ... I forgot the offset. – NickZoic Apr 21 '09 at 09:43
  • my $word = unpack("N", $bytes); should be far faster – Hynek -Pichi- Vychodil Apr 21 '09 at 11:26
  • I tried pack ad nauseam. The rounding of small numbers like e-39 looks to be the problem and Sharky's code works. – Dan Littlejohn Apr 21 '09 at 18:53
  • CAUTION: will not work if perl is compiled with a different floating point implementation. – Brad Gilbert Apr 22 '09 at 17:07
  • Ummm, actually it should correctly unpack IEEE-754 floats no matter what the perl float implementation. That's the whole point, no? – NickZoic Apr 23 '09 at 02:30
5

The best way to do this is to use pack().

my @bytes = ( 0xC2, 0xED, 0x40, 0x00 );
my $float = unpack 'f', pack 'C4', @bytes;

Or if the source and destination have different endianness:

my $float = unpack 'f', pack 'C4', reverse @bytes;

You say that this method "does not work - it seems like it is close" and "fails for small numbers", but you don't give an example. I'd guess that what you are actually seeing is rounding where, for example, a number is packed as 1.234, but it is unpacked as 1.23399996757507. That isn't a function of pack(), but of the precision of a 4-byte float.
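A minimal sketch of both points, assuming Perl 5.10 or later (where pack templates accept the < and > byte-order modifiers) and a platform whose native float is IEEE-754 single precision. The variable names are only for illustration. With an explicit modifier, the byte order no longer depends on the machine running the script.

```perl
use 5.010;
use strict;
use warnings;

my @bytes = (0xC2, 0xED, 0x40, 0x00);    # big-endian encoding of -118.625

# 'f>' reads the four bytes as big-endian, 'f<' as little-endian,
# regardless of the byte order of the host CPU.
my $from_be = unpack 'f>', pack 'C4', @bytes;
my $from_le = unpack 'f<', pack 'C4', reverse @bytes;
print "$from_be $from_le\n";             # -118.625 -118.625

# Round trip through a 4-byte float: 1.234 has no exact
# single-precision representation, so the nearest float comes back.
my $rounded = unpack 'f', pack 'f', 1.234;
printf "%.15g\n", $rounded;              # 1.23399996757507
```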

Peter Mortensen
jmcnamara
  • You could well be right. On the other hand, my code would produce the exact same errors, I'd think. pack/unpack 'f' does conversions to/from "A single-precision float in the native format." ... maybe whatever Dan's running on isn't *quite* IEEE-754? – NickZoic Apr 21 '09 at 09:59
  • "My code would produce the exact same errors". Your code *does* produce the same results as the pack() method, but it isn't an error, just rounding. For example, try unpacking 0x3F9DF3B6, which is 1.234 packed as a float, using your method and the pack method. Also, the "native" float format of the vast majority of systems that perl runs on these days is IEEE 754. If the OP was on a system with a different float format, it would be unusual enough that he would know about it. – jmcnamara Apr 21 '09 at 12:23
  • The rounding is a problem for something like e-39 – Dan Littlejohn Apr 21 '09 at 18:56
  • 1E-39 is in the subnormal (denormal) range of a 4-byte float, so you are going to encounter additional precision issues. Out of interest, could you post an example that converts correctly with the bit-shifting method but not with unpack, and also say what hardware you are on? – jmcnamara Apr 22 '09 at 08:14
  • I have just used it in one of my Perl scripts (extracting information from some [CAN bus](http://en.wikipedia.org/wiki/CAN_bus) traffic), and ***it works***. This should be the accepted answer; it is much better than using bit twiddling. – Peter Mortensen Aug 04 '17 at 22:02
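On the subnormal range raised in the comments above: IEEE-754 single-precision values whose exponent field is 0 carry no implicit leading 1 bit, so a formula that always adds that bit gives a different answer from unpack for such inputs. Whether this explains the e-39 discrepancy reported in the thread is only a guess, since no failing input was posted. The sketch below compares the two calculations on the smallest positive subnormal; it assumes Perl 5.10 or later and an IEEE-754 platform, and the variable names are illustrative.

```perl
use 5.010;
use strict;
use warnings;

# Smallest positive subnormal single: exponent field 0, fraction 1.
my @bytes = (0x00, 0x00, 0x00, 0x01);
my $word  = unpack 'N', pack 'C4', @bytes;
my $raw   = ($word & 0x7F800000) >> 23;
my $frac  = $word & 0x007FFFFF;

# Formula that always assumes an implicit leading 1
# (valid only when the exponent field is 1..254).
my $implicit_one = (1 + $frac / 2**23) * 2 ** ($raw - 127);

# Subnormal rule: no implicit 1, exponent fixed at -126.
my $subnormal = ($frac / 2**23) * 2 ** (-126);

# What unpack reports for the same big-endian bytes.
my $native = unpack 'f>', pack 'C4', @bytes;

printf "implicit-one formula: %g\n", $implicit_one;
printf "subnormal rule:       %g\n", $subnormal;
printf "unpack 'f>':          %g\n", $native;
```

On an IEEE-754 system the last two values should agree, while the first is larger by many orders of magnitude.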