
Got another Erlang binary representation query ('coz that's what I am reading about these days, and I need it for a binary protocol implementation).

If I understand the type-specifier properly, then, for a "float" type value the 8 byte representation seems fine (this is on 64-bit Win7).

1> <<A1/binary>> = <<12.3214/float>>.
<<64,40,164,142,138,113,222,106>>

However, what stumped me was the binary representation of "integer" type values.

2> <<A2/binary>> = <<32512/integer>>.
<<0>>
3> <<A3/binary>> = <<232512518/integer>>.
<<6>>
4> <<A5/binary>> = <<80/integer>>.
<<"P">>

Why are all of those represented in 1 byte? Can someone please explain this?

bdutta74

1 Answer


You're not converting Erlang terms to their binary representation. You're using the bit syntax to build binaries. An integer segment defaults to 8 bits, so a value that doesn't fit is truncated to its lowest byte:

3> <<255/integer>>. % Under one byte
<<"ÿ">>
4> <<256/integer>>. % "Over" one byte
<<0>>
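To make the truncation concrete, here is a minimal sketch using the numbers from the question. The module name `int_segments` and the function `demo/0` are made up for illustration. Every match succeeds only if the byte values are what the comments claim, and giving the segment an explicit size keeps the whole value.

```erlang
-module(int_segments).
-export([demo/0]).

%% A default integer segment is 8 bits, unsigned, big-endian.
%% A value that does not fit keeps only its low-order bits.
demo() ->
    <<0>> = <<32512/integer>>,        % 32512 band 255 =:= 0
    <<6>> = <<232512518/integer>>,    % 232512518 band 255 =:= 6
    <<80>> = <<80/integer>>,          % fits in a byte; the shell prints <<"P">>
    %% With an explicit size in bits the whole value is kept.
    <<127,0>> = <<32512:16/integer>>,
    <<13,219,220,6>> = <<232512518:32/integer>>,
    ok.
```

Calling `int_segments:demo().` should return `ok`. A `badmatch` error would mean one of the byte layouts above is wrong.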

Try this:

5> term_to_binary(32512).
<<131,98,0,0,127,0>>
6> term_to_binary(232512518).
<<131,110,4,0,6,220,219,13>>
7> term_to_binary(80).
<<131,97,80>>

131 is the version number, 97 is a small integer (8-bit, unsigned), 98 is an integer (32-bit, signed), and 110 and 111 are for bignum integers (the small and large big-number formats). The rest is the data for the actual numbers.
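As a rough illustration of those tags, the sketch below decodes only the integer encodings mentioned above. The module `etf_int` and its functions are hypothetical names for this example. In real code `binary_to_term/1` does this job, so treat it as a way to read the bytes and not as a replacement.

```erlang
-module(etf_int).
-export([decode/1]).

%% 131 = version byte, followed by a one-byte tag.
decode(<<131, 97, N>>) ->                       % tag 97: one unsigned byte
    N;
decode(<<131, 98, N:32/signed-big>>) ->         % tag 98: 32-bit signed, big-endian
    N;
decode(<<131, 110, Len, Sign, Digits:Len/binary>>) ->     % tag 110: 1-byte length
    big(Sign, Digits);
decode(<<131, 111, Len:32, Sign, Digits:Len/binary>>) ->  % tag 111: 4-byte length
    big(Sign, Digits).

%% Bignum digits are stored least significant byte first.
big(Sign, Digits) ->
    Bits = bit_size(Digits),
    <<Magnitude:Bits/little-unsigned>> = Digits,
    case Sign of
        0 -> Magnitude;
        1 -> -Magnitude
    end.
```

For instance, `etf_int:decode(<<131,110,4,0,6,220,219,13>>)` gives back `232512518`: length 4, sign 0, then the four digit bytes in little-endian order.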

See the documentation for the Erlang binary term format for further info on what the bytes mean.

Adam Lindberg
  • Thanks for explaining. This is one of my "slow days", so please bear with me. So when we write <<255/integer>>, it is like explicitly saying that we mean the number 255 as an 8-bit (unsigned) integer. That explains why I can do these -- <<255:16/integer>> and <<255:32/integer>>, to turn them into explicit 2-byte and 4-byte little-endian representation, and probably useful for directly working with CANbus data (that is little-endian). Probably not such a good idea? I was hoping to leverage the binary representation to populate protocol headers. – bdutta74 Jul 05 '11 at 13:17
  • Well, working with binary protocols you should do it in exactly that way. That's how you would create binaries in Erlang with different sized integers in them. The Erlang binary term format is something completely different, made for communicating between Erlang nodes or other systems talking "Erlang". See http://bert-rpc.org/ – Adam Lindberg Jul 05 '11 at 14:10
  • @icarus74: the default if you don't specify an endianness (as in your `<<255:16/integer>>`) is big-endian, not little-endian (many protocols use big-endian = network byte order). But you can specify the endianness as big, little or native. See http://www.erlang.org/doc/reference_manual/expressions.html#id77409 – Peer Stritzinger Jul 05 '11 at 14:31
  • @peer-stritzinger thanks for the comment. When I do -- 1> <> = <<3100:64/integer>>. It returns -- <<0,0,0,0,0,0,12,28>> – bdutta74 Jul 05 '11 at 17:16
  • @peer-stritzinger ... looks like my comment post was truncated for some reason. What I meant with that last example is that, the emulator seems to indicate the native ordering to be little-endian. It's on Win7 64bit, though it shouldn't cause this difference. – bdutta74 Jul 06 '11 at 06:28
  • @adam-lindberg, thanks for the link to bert-rpc.org! Still need to cover some distance with Erlang, but I am sure it'd be useful. I'll have a need for some kind of a message-bus, and was vaguely considering ZeroMQ for communicating between a heterogeneous set of nodes (some being C/C++ apps, and a few Java apps). – bdutta74 Jul 06 '11 at 06:32
  • @icarus but your example shows big endian encoding, see http://en.wikipedia.org/wiki/Endianness – Peer Stritzinger Jul 10 '11 at 19:15
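To put the endianness point from these comments in one place, here is a small sketch using the 3100 value from above. The module `endian_demo`, the function `demo/0`, and the 16-bit `Id` field are invented for illustration and do not describe any particular protocol.

```erlang
-module(endian_demo).
-export([demo/0]).

demo() ->
    %% No endianness given: the segment is big-endian.
    <<0,0,0,0,0,0,12,28>> = <<3100:64/integer>>,
    <<0,0,0,0,0,0,12,28>> = <<3100:64/big>>,
    %% Little-endian has to be requested explicitly.
    <<28,12,0,0,0,0,0,0>> = <<3100:64/little>>,
    %% native follows the CPU, so it equals one of the two layouts above.
    Native = <<3100:64/native>>,
    true = (Native =:= <<3100:64/big>>) orelse (Native =:= <<3100:64/little>>),
    %% Reading a little-endian 16-bit field from incoming bytes.
    <<Id:16/little, Rest/binary>> = <<16#34, 16#12, 1, 2>>,
    {16#1234, <<1,2>>} = {Id, Rest},
    ok.
```

`endian_demo:demo().` should return `ok` on any machine, since the `native` line only checks that it agrees with one of the explicit orders.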