Context: I'm reading and writing data which, for storage reasons, comes as 24-bit integers (signed or unsigned doesn't matter, as each value is really 8 octal digits). I need to store and read a large number of these integers with pack and unpack. The application is space-critical, so using 32-bit integers is not desirable.

However, pack doesn't seem to have an option for 24-bit integers. How does one deal with this? I currently use a custom function

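function pack24bit($n) {
    // intdiv() keeps the intermediate values as integers;
    // plain "/" would produce floats here.
    $b3 = $n % 256;          // least significant byte
    $b2 = intdiv($n, 256);
    $b1 = intdiv($b2, 256);  // most significant byte
    $b2 = $b2 % 256;         // middle byte
    // Most significant byte first (big-endian).
    return pack('CCC', $b1, $b2, $b3);
}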

and

function unpack24bit($packed) {
    $arr = unpack('C3b',$packed);
    return 256*(256*$arr['b1']+$arr['b2'])+$arr['b3'];
}

but maybe there are more direct ways?

user1111929
  • Could you use the format string "CS", which would give you 24 bits? – Graeme Apr 23 '16 at 20:15
  • Be aware that PHP's int size is platform-dependent. It could be 32-bit. It could also be 64-bit. – nl-x Apr 23 '16 at 21:10
  • @nl-x Could you explain how that remark relates to my question? I don't think this matters when 24 bits is the most I'm working with. – user1111929 Apr 23 '16 at 21:16

1 Answer

There is no such thing as a 24-bit integer on any modern CPU that I'm aware of, which is why your desired packing is not directly supported.

I recommend packing your bytes individually, as you suggested. Be mindful of endianness.
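
As a rough sketch of what byte-by-byte packing could look like with bitshifts: the helper names `pack24` and `unpack24` are hypothetical, and big-endian order (most significant byte first) is an assumption made here purely for illustration.

```php
<?php
// Hypothetical helpers; the byte order (big-endian) is an arbitrary choice
// that only needs to be applied consistently and documented.

function pack24(int $n): string
{
    // Mask each byte explicitly so only the low 24 bits are stored.
    return pack('CCC', ($n >> 16) & 0xFF, ($n >> 8) & 0xFF, $n & 0xFF);
}

function unpack24(string $packed): int
{
    // 'C3b' reads three unsigned bytes into the keys b1, b2 and b3.
    $b = unpack('C3b', $packed);
    return ($b['b1'] << 16) | ($b['b2'] << 8) | $b['b3'];
}

// Round trip
$bytes = pack24(0x123456);
echo bin2hex($bytes), "\n";   // 123456
echo unpack24($bytes), "\n";  // 1193046
```

Passing the three bytes to `pack` in the opposite order, and reassembling them accordingly, would give a little-endian layout instead.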

Lightness Races in Orbit
  • Could you specify what you mean by 'Be mindful of endianness'? Looking at http://php.net/manual/en/function.pack.php it seems that 'C' is my only flag option, so there's no room to specify endianness. – user1111929 Apr 23 '16 at 20:51
  • @user1111929: Exactly. You have to manage that yourself when you're splitting your data into bytes and then joining it up again at the other end: at some point you have to decide what the code that does those things will look like. It should be enough to just document which endianness you've decided to use in your packed byte representation. – Lightness Races in Orbit Apr 23 '16 at 20:55
  • I understand the why, but I don't understand the how. How do I 'decide' which endianness to use? It seems to me like the system automatically decides the endianness for me and doesn't offer me any choice in that. That would indeed have a disastrous effect if a future system interpreted it the other way around, but I don't see how to counter that given I only have the 'C' flag. – user1111929 Apr 23 '16 at 20:59
  • @user1111929: No choice? Hint: `pack('CCC',$b1,$b2,$b3)` and `pack('CCC',$b3,$b2,$b1)` are not the same thing :) Although I don't quite understand your encoding scheme. You'd be better off with some bitshifts... – Lightness Races in Orbit Apr 23 '16 at 21:01
  • Ooooh, endianness in that way. I thought you meant that the individual bits within $b1 could be reversed on another system. Derp. :p – user1111929 Apr 23 '16 at 21:03