In the SEG-D seismic data format, some header parameters are formatted as three bytes, two’s complement, signed binary. All values are big-endian.

Ruby's String#unpack has directives for 8-bit, 16-bit, 32-bit, and 64-bit integers, but none for 24-bit.

How can I get the binary values converted into integers in the following two’s complement way:

"\x00\x00\x00" => 0x000000 (0)
"\x00\x00\x01" => 0x000001 (1)
"\x7F\xFF\xFF" => 0x7FFFFF (8388607)
"\xFF\xFF\xFF" => -0x000001 (-1)
"\x80\x00\x00" => -0x800000 (-8388608)
sawa
Victor

1 Answer

Convert the first byte as a signed 8-bit integer (two’s complement) and the second and third bytes as an unsigned 16-bit integer.

# String to be converted (-8388608)
bin = "\x80\x00\x00"

# Convert as signed 8-bit and unsigned 16-bit (big-endian)
values = bin.unpack("c S>")

# Add with the first byte weighted
converted = values[0] * 2**16 + values[1]

An alternative version of the last line uses a bitwise shift and OR (probably more efficient). No parentheses are needed because `<<` binds more tightly than `|`:

converted = values[0] << 16 | values[1]
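
As a rough check, the approach above can be wrapped in a small method and run against the five byte strings from the question. This is only a sketch: the method name `int24_be` and the length check are assumptions added for illustration and are not part of the original answer.

```ruby
# Hypothetical helper (the name is an assumption, not from the answer).
# Decodes a 3-byte, big-endian, two's complement string into an Integer.
def int24_be(bytes)
  raise ArgumentError, "expected exactly 3 bytes" unless bytes.bytesize == 3

  # "c"  => signed 8-bit integer (this byte carries the sign)
  # "S>" => unsigned 16-bit integer, big-endian
  high, low = bytes.unpack("cS>")
  (high << 16) | low
end

# The sample inputs from the question, forced to binary encoding with String#b.
samples = ["\x00\x00\x00", "\x00\x00\x01", "\x7F\xFF\xFF", "\xFF\xFF\xFF", "\x80\x00\x00"]
samples.each { |s| puts int24_be(s.b) }
# prints 0, 1, 8388607, -1, -8388608 (one per line)
```

The sign only needs to be handled on the most significant byte; the remaining 16 bits are plain magnitude, which is why the unsigned directive is used for them.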
Victor
  • I have great respect for anyone who understands `pack/unpack`. I've tried more than once to master it but always give up after a few minutes with a splitting headache. Consider the following tweak: `values[0] << 16 + values[1]`. See [Integer#<<](http://ruby-doc.org/core-2.4.0/Integer.html#method-i-3C-3C). – Cary Swoveland Jul 29 '18 at 19:00
  • @CarySwoveland It is indeed more elegant but I only use `<<` and other bitwise operations in private. They are quite cryptic for beginners. :-) – Victor Jul 29 '18 at 23:53
  • If a Rubyist understands `pack/unpack` it's a reasonable assumption that they are aware of `Integer#<<` (as well as the subtleties of metaprogramming). – Cary Swoveland Jul 30 '18 at 00:30
  • @CarySwoveland speaking of subtleties: `(values[0] << 16) + values[1]` ;-) – Stefan Jul 30 '18 at 08:32
  • @Stefan, I did test, and got the same result, but did not notice that `values[1] #=> 0`. – Cary Swoveland Jul 30 '18 at 16:32
  • @CarySwoveland another option is to use binary OR instead of plus, i.e. `values[0] << 16 | values[1]`. Or `a, b = bin.unpack('c S>')` and `a << 16 | b`. Or for a one-liner `bin.unpack('c S>').yield_self { |a, b| a << 16 | b }` – Stefan Jul 31 '18 at 09:29
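
The precedence point raised in the comments can be checked directly. In Ruby, `+` binds more tightly than `<<`, so the unparenthesized form shifts by `16 + low` instead of adding `low` after the shift. The two method names below are hypothetical and exist only to compare the forms side by side.

```ruby
# Hypothetical helpers, named only for this comparison.
def unparenthesized(high, low)
  high << 16 + low       # Ruby parses this as high << (16 + low)
end

def parenthesized(high, low)
  (high << 16) + low     # shift first, then add the low 16 bits
end

# When the low 16 bits are zero, both forms agree, which hides the problem.
high, low = "\x80\x00\x00".b.unpack("cS>")   # high = -128, low = 0
puts unparenthesized(high, low) == parenthesized(high, low)   # prints true

# With non-zero low bits the results differ.
high, low = "\x00\x00\x01".b.unpack("cS>")   # high = 0, low = 1
puts unparenthesized(high, low)   # prints 0
puts parenthesized(high, low)     # prints 1
```

Using `|` instead of `+` sidesteps the issue, since `<<` has higher precedence than `|`.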