
I'm writing a method that receives an instance of bytes::Bytes representing a Type/Length/Value data structure, where byte 0 is the type, the next 4 bytes are the length, and the remaining bytes are the value. I implemented a unit test that is behaving in a very unexpected way.

Given the method:

fn split_into_packets(packet: &Bytes) -> Vec<Bytes> {
    let mut packets: Vec<Bytes> = Vec::new();
    let mut begin: usize = 1;
    while begin < packet.len() {
        let slice = &packet[1..5];
        print!("0: {} ", slice[0]);
        print!("1: {} ", slice[1]);
        print!("2: {} ", slice[2]);
        println!("3: {}", slice[3]);
        let size = u32::from_be_bytes(pop(slice));
        println!("{}", size);
    }
    return packets;
}

And the test:

let mut bytes = BytesMut::with_capacity(330);
bytes.extend_from_slice(b"R\x52\x00\x00\x00\x08\x00");
let packets = split_into_packets(&bytes.freeze());

I see the following on my console:

0: 82 1: 0 2: 0 3: 0

I expected it to be:

0: 0 1: 0 2: 0 3: 82

What's going on? What am I missing?

Shepmaster
ruipacheco
  • 3
    Yes, it's called [endianness](https://en.wikipedia.org/wiki/Endianness) – Alexey S. Larionov Jun 09 '20 at 19:11
  • 1
    It's hard to answer your question because it doesn't include a [MRE]. We can't tell what crates (and their versions), types, traits, fields, etc. are present in the code. It would make it easier for us to help you if you try to reproduce your error on the [Rust Playground](https://play.rust-lang.org) if possible, otherwise in a brand new Cargo project, then [edit] your question to include the additional info. There are [Rust-specific MRE tips](//stackoverflow.com/tags/rust/info) you can use to reduce your original code for posting here. Thanks! – Shepmaster Jun 09 '20 at 19:15
  • You are using `from_be_bytes` — why did you pick that function? – Shepmaster Jun 09 '20 at 19:16
  • @Shepmaster That function does not matter afaik. It's a leftover from my production code. – ruipacheco Jun 09 '20 at 19:42
  • @AlexLarionov what do you mean? – ruipacheco Jun 09 '20 at 19:44
  • @Shepmaster This code runs as is, not sure how to make it more concise. I've linked to the package I use for byte::Bytes. – ruipacheco Jun 09 '20 at 19:45
  • *This code runs as is* — it [literally does not](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=40590cfbef705c6de60c98eac65d189e). – Shepmaster Jun 09 '20 at 20:02
  • Depending on platform, numbers might be stored as big endian (most significant byte of a number is first) or little endian (most significant byte is last). You've got little endian, so numbers are stored from least significant byte to most significant – Alexey S. Larionov Jun 09 '20 at 20:03
  • *That function does not matter* — there's a reason I focused on that function. If you read the documentation for it in combination with the other comments, you might have a eureka moment. – Shepmaster Jun 09 '20 at 20:04
  • 2
    @AlexLarionov I'd quibble and say that platform is less important than the format of the data in this case. – Shepmaster Jun 09 '20 at 20:05
  • @Shepmaster how does that function affect the printing of values? Removing it changes nothing. – ruipacheco Jun 09 '20 at 20:28
  • 1
    If "Removing it changes nothing", then it seems your example isn't actually minimal. Please refer back to my first comment. – Shepmaster Jun 09 '20 at 21:06
  • 2
    Your string contains `R`, then `0x52` followed by three `0`, then `8` and `0`. You skip the first item and print the next four, so `0x52` and three zeroes. `0x52` is 82 in decimal. Why do you expect a different order? – Jmb Jun 10 '20 at 07:19
  • @Shepmaster is there a Repl that allows me to include external packages? the ones I found seem to only allow the standard library. – ruipacheco Jun 10 '20 at 17:12
  • @ruipacheco not in general, no, as doing such is a security risk. The Rust playground, linked earlier, already has the `bytes` crate pre-installed. That's the only dependency that you've stated you need. – Shepmaster Jun 10 '20 at 17:52
  • @Jmb want to turn that into an answer? I simply did not see the R. – ruipacheco Jun 10 '20 at 20:01

1 Answer

fn split_into_packets(packet: &Bytes) -> Vec<Bytes> { // packet = "R\x52\x00\x00\x00\x08\x00"
    let mut packets: Vec<Bytes> = Vec::new();
    let mut begin: usize = 1;
    while begin < packet.len() {
        let slice = &packet[1..5]; // slice = "\x52\x00\x00\x00" (the leading "R" is skipped)
        print!("0: {} ", slice[0]); // slice[0] is the 1st byte, "\x52" = 0x52 = 82 (in decimal)
        print!("1: {} ", slice[1]); // slice[1] is the 2nd byte, "\x00" = 0x0 = 0 (in decimal)
        print!("2: {} ", slice[2]); // slice[2] is the 3rd byte, "\x00" = 0x0 = 0 (in decimal)
        println!("3: {}", slice[3]); // slice[3] is the 4th byte, "\x00" = 0x0 = 0 (in decimal)
        let size = u32::from_be_bytes(pop(slice));
        println!("{}", size);
    }
    return packets;
}

I hope the above explains why you get 82, 0, 0, 0 when printing the bytes one after another.
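
The same observation can be checked in isolation. The following is a minimal sketch that uses a plain `&[u8]` instead of `bytes::Bytes` to stay dependency-free; `length_field` is a hypothetical helper name introduced only for this illustration. It shows that the literal `R` is itself the byte 0x52, so the test data starts with two identical bytes.

```rust
/// Hypothetical helper: returns the four bytes that follow the leading type byte.
fn length_field(packet: &[u8]) -> &[u8] {
    &packet[1..5]
}

fn main() {
    // Same byte string as in the question's test.
    let data: &[u8] = b"R\x52\x00\x00\x00\x08\x00";

    // `R` is ASCII 0x52 (82 in decimal), so bytes 0 and 1 are equal.
    println!("{:?}", data); // prints [82, 82, 0, 0, 0, 8, 0]

    // Skipping byte 0 leaves 0x52 followed by three zero bytes.
    println!("{:?}", length_field(data)); // prints [82, 0, 0, 0]
}
```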

On to the next thing: how do we convert 4 bytes to a u32? There are two possibilities, which differ in the byte order they assume:

  • from_be_bytes: Converts bytes to u32 in big-endian: u32::from_be_bytes([0x12, 0x34, 0x56, 0x78])==0x12345678
  • from_le_bytes: Converts bytes to u32 in little-endian: u32::from_le_bytes([0x78, 0x56, 0x34, 0x12])==0x12345678
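
To make the difference concrete, here is a small sketch applying both conversions to the four bytes from the question. The helpers `be_u32` and `le_u32` are hypothetical names for this illustration; they only wrap the standard `u32::from_be_bytes` and `u32::from_le_bytes` and use `try_into` to turn a slice into the `[u8; 4]` those functions require.

```rust
use std::convert::TryInto;

/// Hypothetical helper: interprets exactly 4 bytes as a big-endian u32.
fn be_u32(bytes: &[u8]) -> Option<u32> {
    let array: [u8; 4] = bytes.try_into().ok()?; // None unless len == 4
    Some(u32::from_be_bytes(array))
}

/// Hypothetical helper: interprets exactly 4 bytes as a little-endian u32.
fn le_u32(bytes: &[u8]) -> Option<u32> {
    let array: [u8; 4] = bytes.try_into().ok()?; // None unless len == 4
    Some(u32::from_le_bytes(array))
}

fn main() {
    let slice = [0x52u8, 0x00, 0x00, 0x00];

    // Big-endian: the first byte is the most significant one.
    assert_eq!(be_u32(&slice), Some(0x5200_0000));

    // Little-endian: the first byte is the least significant one.
    assert_eq!(le_u32(&slice), Some(0x52)); // 82 in decimal

    // A slice of the wrong length is rejected instead of panicking.
    assert_eq!(be_u32(&slice[..3]), None);
}
```

Note that the order in which the individual bytes print is the same in both cases; only the interpretation of the four bytes as one number changes.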

For endianness, you can consult, for example, the Wikipedia page on the topic.
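
Putting the pieces together, a loop over Type/Length/Value records might look like the sketch below. This is not the asker's production code: `split_tlv` is a hypothetical function, it works on a plain `&[u8]` (which a `Bytes` dereferences to), and it assumes that the length field is big-endian and counts only the value bytes. If the real format counts the header as well, the arithmetic would need adjusting.

```rust
use std::convert::TryInto;

/// Hypothetical sketch: splits concatenated TLV records into (type, value) pairs.
/// Assumes a 1-byte type, a 4-byte big-endian length covering only the value,
/// and returns None if any record is truncated.
fn split_tlv(mut buf: &[u8]) -> Option<Vec<(u8, &[u8])>> {
    let mut records = Vec::new();
    while !buf.is_empty() {
        if buf.len() < 5 {
            return None; // not enough bytes for type + length
        }
        let kind = buf[0];
        let len_bytes: [u8; 4] = buf[1..5].try_into().ok()?;
        let len = u32::from_be_bytes(len_bytes) as usize;
        let end = 5usize.checked_add(len)?;
        if buf.len() < end {
            return None; // value is shorter than the declared length
        }
        records.push((kind, &buf[5..end]));
        buf = &buf[end..]; // advance to the next record
    }
    Some(records)
}

fn main() {
    // Two records: type 'R' with value "hi", then type 'S' with an empty value.
    let data = b"R\x00\x00\x00\x02hiS\x00\x00\x00\x00";
    println!("{:?}", split_tlv(data));

    // The question's test data declares a length of 0x52000000, so it is rejected.
    println!("{:?}", split_tlv(b"R\x52\x00\x00\x00\x08\x00")); // prints None
}
```

Advancing `buf` on every iteration is what lets the loop terminate; the original loop never updates `begin`.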

phimuemue