What's wrong with this 2-bytes-to-int conversion?

Question

I'm trying to parse a JPEG file. This page says that the format is the following:

0xFF+Marker Number(1 byte)+Data size(2 bytes)+Data(n bytes)

So, when I encounter a 0xFF, I read the data like this (s is the JPEG file stream):

int marker, size;
byte[] data;
//marker number (1 byte)
marker = s.ReadByte();
//size (2 bytes)
byte[] b = new byte[2];
s.Read(b, 0, 2);
size = BitConverter.ToInt16(b, 0);

The problem is that size's value after that is -7937, which causes the next lines to raise an exception because I try to allocate a byte[] of length -7937. At that point, b[0] == 255 and b[1] == 224.

I suspect I don't use BitConverter.ToInt16 properly, but I can't find what I did wrong.

The BitConverter doc page says that "The order of bytes in the array must reflect the endianness of the computer system's architecture", but when I swap the two bytes like this:

byte a = b[0]; b[0] = b[1]; b[1] = a;
size = BitConverter.ToInt16(b, 0);

...I get size == -32 which is not really better.

What's the problem?

Perhaps it's an unsigned short, so you should use `BitConverter.ToUInt16()` — Matthew Watson, Aug 09 '16 at 11:09
The bits are obviously `0xFFE0` or `0xE0FF`, both have the most significant bit set, so Matthew seems right, it should be an unsigned word. Use `uint size = BitConverter.ToUInt16()`. — René Vogt, Aug 09 '16 at 11:14
have a look at http://stackoverflow.com/a/8227753/6007877 I think you have forgotten about the 0xE1 — Git, Aug 09 '16 at 11:20
Don't use `BitConverter` when parsing protocols, because you want the byte-order handling to behave the same regardless of the architecture. [This answer](http://stackoverflow.com/a/7190266/69809) says that, while JPEG data is **big endian** (meaning `BitConverter` on x86 *won't* work as is), the header can be encoded either way. [This page](http://www.fileformat.info/format/jpeg/corion.htm) also gives some hints on how to detect header endianness. — vgru, Aug 09 '16 at 11:59
Thanks all, Matthew was right: using uint, I get at least some UTF-8-encoded fields. Why post a comment and not an answer? Also, gismo's link is much clearer than the page I linked in the question. — Hey, Aug 09 '16 at 12:20
@Groo what do you suggest? This is the first time I've tried to parse a binary format, and I don't really know how to do it properly. Is there another built-in function that takes the endianness as an argument? — Hey, Aug 09 '16 at 12:26
You can check out Jon Skeet's [miscutil](http://www.yoda.arachsys.com/csharp/miscutil/) (a bit dated, last version seems to be from 2009), which contains both big and little endian converters. Or, you can write them yourself (it's basically `(a[i] << 8) | a[i + 1]` vs `a[i] | (a[i + 1] << 8)`, but you will need to implement it for 16, 32 and 64 bits). And of course keep the sign in mind, `ToInt16` is not the same as `ToUInt16`. — vgru, Aug 09 '16 at 12:35
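
The hand-written converters mentioned in the last comment could look roughly like the sketch below. The class and method names (`EndianSketch`, `ReadUInt16BigEndian`, and so on) are made up for illustration and are not part of miscutil or the .NET framework; only the 16-bit case is shown.

```csharp
using System;

static class EndianSketch
{
    // Big-endian: the first byte is the most significant one.
    public static ushort ReadUInt16BigEndian(byte[] a, int i)
    {
        return (ushort)((a[i] << 8) | a[i + 1]);
    }

    // Little-endian: the first byte is the least significant one.
    public static ushort ReadUInt16LittleEndian(byte[] a, int i)
    {
        return (ushort)(a[i] | (a[i + 1] << 8));
    }

    // Signed variant: the same 16 bits, reinterpreted as two's complement.
    public static short ReadInt16BigEndian(byte[] a, int i)
    {
        return unchecked((short)ReadUInt16BigEndian(a, i));
    }
}
```

With the two bytes from the question (255 and 224), the big-endian unsigned read gives 65504 and the little-endian unsigned read gives 57599. The signed interpretations of those same bit patterns are the -32 and -7937 reported in the question.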

score 1 · Answer 1 · answered Aug 10 '16 at 02:18

Integers are stored in Big Endian order in JPEG. If you are on a little endian system (e.g. Intel) you need to reverse the order of the bytes in the length field. Length fields are unsigned.
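
As a rough illustration of this answer, the sketch below reads an unsigned big-endian length straight from the stream, without `BitConverter`, so it behaves the same on any architecture. The names `JpegSegmentSketch`, `ReadUInt16BigEndian` and `ReadSegmentPayload` are hypothetical. It assumes the stream is positioned on a two-byte length field and that the stored length counts its own two bytes, which is how JPEG segment lengths are defined.

```csharp
using System;
using System.IO;

static class JpegSegmentSketch
{
    // Reads a 16-bit unsigned big-endian value from the current stream position.
    public static int ReadUInt16BigEndian(Stream s)
    {
        int hi = s.ReadByte();
        int lo = s.ReadByte();
        if (hi < 0 || lo < 0)
            throw new EndOfStreamException("Stream ended inside a 2-byte field.");
        return (hi << 8) | lo;
    }

    // Reads the length field, then the payload that follows it.
    // Assumption: the stored length includes the two length bytes themselves.
    public static byte[] ReadSegmentPayload(Stream s)
    {
        int length = ReadUInt16BigEndian(s);
        if (length < 2)
            throw new InvalidDataException("Segment length must be at least 2.");

        byte[] data = new byte[length - 2];
        int read = 0;
        while (read < data.Length)
        {
            // Stream.Read may return fewer bytes than requested, so loop.
            int n = s.Read(data, read, data.Length - read);
            if (n == 0)
                throw new EndOfStreamException("Stream ended inside a segment.");
            read += n;
        }
        return data;
    }
}
```

Returning `int` from the 16-bit read keeps the value non-negative without any signed/unsigned casts at the call site. Note that not every marker is followed by a length: the start-of-image marker (0xD8), for example, has none, so a real parser has to check the marker before calling something like `ReadSegmentPayload`.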

answered Aug 10 '16 at 02:18

user3344003


Thank you, but I already tried to reverse the two bytes (see the end of my question). The unsigned int was the solution. – Hey Aug 10 '16 at 08:29

score 1 · Accepted Answer · answered Jan 28 '17 at 18:04

The data in question was an unsigned int. Using the uint type and BitConverter.ToUInt16 fixed it.
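
For readers who want to keep `BitConverter`, here is one possible way to combine the unsigned conversion with an explicit byte-order check. This is only a sketch and not necessarily the code that was used here; `BitConverterSketch` and `ToUInt16FromBigEndian` are made-up names, while `BitConverter.IsLittleEndian`, `BitConverter.ToUInt16` and `Array.Reverse` are the standard .NET members.

```csharp
using System;

static class BitConverterSketch
{
    // Interprets two bytes stored in big-endian order as an unsigned 16-bit value.
    // A copy of the pair is reversed on little-endian hosts so that
    // BitConverter sees the bytes in the order it expects.
    public static ushort ToUInt16FromBigEndian(byte[] source, int offset)
    {
        byte[] pair = { source[offset], source[offset + 1] };
        if (BitConverter.IsLittleEndian)
            Array.Reverse(pair);
        return BitConverter.ToUInt16(pair, 0);
    }
}
```

For the bytes 255 and 224 from the question, this returns 65504, the unsigned reading of the same 16 bits that `ToInt16` reported as -32 after the manual swap.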

answered Jan 28 '17 at 18:04

Hey

