My task is to read metadata values from an unsigned char array that contains the bytes of a binary .shp file (Shapefile):

unsigned char* bytes;

The header of the file is stored at the start of the array, and its fields appear in this order:

int32_t filecode;   // big-endian
int32_t skip[5];    // uninteresting stuff
int32_t filelength; // big-endian
int32_t version;    // little-endian
int32_t shapetype;  // little-endian
// Rest of the header and of the file content, which I don't need

So my question is: how can I extract this information (except the skip part, of course), taking the endianness into account, and read it into the corresponding variables?

I thought about using ifstream, but I couldn't figure out how to do it properly.

Example:

Read the first four bytes of the binary, interpret them as big-endian, and store the result in an int32_t. Then skip 5 * 4 bytes (5 * int32_t). Then read four bytes, interpret them as big-endian, and store the result in an int32_t. Then read four bytes, interpret them as little-endian, and again store the result in an int32_t, and so on.

Thanks for your help guys!

Samuel Dressel
  • Reading from a file in standard C++ uses `ifstream`, and for binary reads you use `istream::read`. What did you try and how did it fail? – john May 14 '20 at 18:16
  • But I don't understand why you say in one sentence the input is an unsigned char array, and the next sentence that it's a binary file. Which is it? – john May 14 '20 at 18:18
  • `int32_t` is four bytes. To construct an `int32_t` value from a byte stream, you need to "read" (extract) four bytes from the stream and combine them (bitwise shifts and OR operations are a common way to do it). The endianness issue is harder, unless you know exactly the endianness of the bytes coming in the stream. – Some programmer dude May 14 '20 at 18:19
  • What did you try? – Costantino Grana May 14 '20 at 18:25
  • @john I'm currently programming a WASM plugin for a Rust project. So yes, I'm reading a .shp file, but for the plugin I get the bytes of the file as an unsigned char array. So basically, for my task I have to read from an unsigned char array in which the bytes of the file are stored. – Samuel Dressel May 15 '20 at 11:42
  • @Someprogrammerdude I know the byte order in the byte stream (see the comments in my sample code). To get the variables correctly, I have to use the right endianness and convert accordingly. – Samuel Dressel May 15 '20 at 11:46
  • I tried using ifstream/istream, but an unsigned char is one byte. So I couldn't figure out how to use ifstream to, for example, read the first four bytes, which represent the int32_t for the filecode, make sure they are interpreted as big-endian, and store the result in an int32_t. – Samuel Dressel May 15 '20 at 11:49

1 Answer

So 'reading' a byte array just means extracting the bytes from the positions in the byte array where you know your data is stored. Then you just need to do the appropriate bit manipulations to handle the endianness. So, for example, filecode would be this

filecode = (bytes[0] << 24) | (bytes[1] << 16) | (bytes[2] << 8) | bytes[3];

and version would be this

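version = bytes[28] | (bytes[29] << 8) | (bytes[30] << 16) | (bytes[31] << 24);

(The offset is 28 because, going by the layout you stated above, filecode occupies bytes 0 to 3, the five skipped int32_t values occupy bytes 4 to 23, and filelength occupies bytes 24 to 27.)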
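Putting the pieces together, here is a minimal sketch that reads all four fields from the layout given in the question. The names `read_be32`, `read_le32`, `ShpHeader` and `parse_shp_header` are hypothetical helpers made up for this example, not part of any library. The offsets 0, 24, 28 and 32 are assumptions derived from the field sizes listed in the question (4 bytes per `int32_t`, with 5 skipped values after `filecode`). Each byte is cast to `uint32_t` before shifting so that a byte of 0x80 or above is never shifted into the sign bit of an `int`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: combine four bytes stored most-significant byte first.
static std::uint32_t read_be32(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Hypothetical helper: combine four bytes stored least-significant byte first.
static std::uint32_t read_le32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])        |
           (static_cast<std::uint32_t>(p[1]) << 8)  |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Only the fields the question asks for.
struct ShpHeader {
    std::int32_t filecode;
    std::int32_t filelength;
    std::int32_t version;
    std::int32_t shapetype;
};

// Returns false if the buffer is too short to hold the fields read here.
// Offsets assume the layout from the question: 4 + 5*4 = 24 bytes precede
// filelength, so version starts at 28 and shapetype at 32.
bool parse_shp_header(const unsigned char* bytes, std::size_t size,
                      ShpHeader& out) {
    if (bytes == nullptr || size < 36) {
        return false;
    }
    // The uint32_t -> int32_t conversion is modular since C++20 and
    // implementation-defined (in practice the same) for large values before.
    out.filecode   = static_cast<std::int32_t>(read_be32(bytes + 0));
    out.filelength = static_cast<std::int32_t>(read_be32(bytes + 24));
    out.version    = static_cast<std::int32_t>(read_le32(bytes + 28));
    out.shapetype  = static_cast<std::int32_t>(read_le32(bytes + 32));
    return true;
}
```

Because the helpers build the value from individual bytes, the result does not depend on the endianness of the machine the code runs on, so no byte swapping is needed. A call would look like `ShpHeader h; if (parse_shp_header(bytes, size, h)) { /* use h.version */ }`, where `size` is the length of the array you were handed.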

john