u8 as i32 vs i32::from_be_bytes([u8;4])

Question

Total bike-shedding here, but I'm wondering about the relative efficiency of

let mut buf = [0u8];
input.read(&mut buf)?;
Ok(buf[0] as i32)

vs

let mut buf = [0u8; 4];
input.read(&mut buf[3..4])?;
Ok(i32::from_be_bytes(buf))

Even on aesthetic grounds its hard to choose between these since in isolation I prefer the first set of code, but I have parallel methods for reading 16-, 24- and 32-bit integer data that follow the latter pattern which makes it more appealing in context.

@Stargateur Because when I'm reading, e.g., a 24-bit (unsigned) integer, I can have almost exactly the same code, just changing the slice to `buf[1..4]` — Don Hosek, Apr 29 '21 at 14:25
not a valid reason for me. I would advice you to use external crate for doing thing like that like https://docs.rs/nom/6.1.2/nom/number/complete/index.html — Stargateur, Apr 29 '21 at 14:27
While `nom` looks really nice it seems like overkill for my case where I need five 3-line functions for reading my data and the nom functions don't quite meet my needs in that I want an `i32` return value regardless of the size of the input data. — Don Hosek, Apr 29 '21 at 14:37
"i32 return value regardless of the size of the input data" that also look weird — Stargateur, Apr 29 '21 at 14:38
The code is reading GF data. Knuth's TeX-related binary files all do similar things where there are typically 4 forms of an opcode taking 1–4 byte arguments. The operation is the same regardless of the data size so for organization purposes, it makes sense to have a single `paint` function which will, depending on the opcode, take the return value of one of `read1` through `read4` as its argument. — Don Hosek, Apr 29 '21 at 15:53
(and the file formats date back 40+ years which is why there are all these optimizations for file size that today would be viewed as needless) — Don Hosek, Apr 29 '21 at 15:54
@DonHosek without seing complete code it's hard to know anyway, your first code is wrong do `buf[0] as i8 as i32` — Stargateur, Apr 29 '21 at 17:02
@stargateur it’s not wrong. 8-bit values are unsigned data. 0xff is 255 and not -1. — Don Hosek, Apr 29 '21 at 18:37
so you convert all value to signed when value are not signed ???? what you do don't make sense — Stargateur, Apr 30 '21 at 00:19
Read my coment that begins with "the code is reading GF data". An opcode has 1–4 bytes following it. 8, 16 and 24 bit values are encoded unsigned, 32 bit values are encoded signed. But it's a single function implementing all four opcodes. I know that XY problems are not uncommon somewhere like stackoverflow, but I *do* very much know my problem domain. — Don Hosek, Apr 30 '21 at 01:36

score 2 · Answer 1 · answered Apr 29 '21 at 15:41

Your code will not handle signed integers properly, assuming you intend to store negative values. If a file has an i8 with value -1 (0xFF), you will read it as a u8 with value 255 and cast to an i32, yielding the wrong value (see this playground). You need to perform sign extension if you want values to come back correctly.

I would suggest using the byteorder crate, which has methods for correctly reading integers in an endian-aware manner. The resulting code would look like

use byteorder::{LE, ReadBytesExt};

reader.read_i8()? as i32;
reader.read_i16::<BE>()? as i32;
reader.read_i24::<BE>()? as i32;
reader.read_i32::<BE>()?;

The code reads *unsigned* 8-, 16- and 24-bit values and *signed* 32-bit values. — Don Hosek, Apr 29 '21 at 15:48

score 2 · Accepted Answer · answered Apr 29 '21 at 15:52

2

It looks like the first one is going to be slightly more efficient. It uses a movzx instead of a bswap.

You can see it here: https://rust.godbolt.org/z/WxW1bj4Eh

It's pretty difficult for the compiler to be able to convert the bswap into movzx because it will need to see through the read to understand that the rest of the buffer is 0.

answered Apr 29 '21 at 15:52

Jeff Muizelaar

574
2
8

`movzx` is 3 cycles while `bswap` is 1 (!) according to http://jsimlo.sk/docs/cpu/index.php/movzx.html and http://jsimlo.sk/docs/cpu/index.php/bswap.html And it's apparently more efficient to initialize [u8,4] than [u8,1] – Don Hosek Apr 29 '21 at 16:02
1

I'm not sure what source those pages use for timing. `movzx` needs to be compared to `mov; bswap` and not just `bswap` alone. You can see a simulation of both compilations here: https://gcc.godbolt.org/z/rPvqK6e3E Over 100 iterations the first one uses 1703 cycles vs 1731 for the second. – Jeff Muizelaar Apr 30 '21 at 17:52
Ah, I wasn't thinking about needing the second instruction for comparison. It looks like with the simulation that the difference is negligible. Thanks so much for this and the pointer to godbolt. – Don Hosek Apr 30 '21 at 17:57

u8 as i32 vs i32::from_be_bytes([u8;4])

2 Answers2