
I'm writing a parser for the most common geographic data storage type, a collection of files called a "shapefile". This is my first project where I've had to think about endianness.

It turns out that the geometry storage is mixed endian; some parts of the file are big endian, but most of it is little endian. The shapefile standard is described here.

Is there a discernible performance rationale for this, or was it simply born out of historical context? If the latter, do you happen to know what that historical context is?

> The integers and double-precision integers that make up the data description fields in the file header (identified below) and record contents in the main file are in little endian (PC or Intel®) byte order. The integers and double-precision floating point numbers that make up the rest of the file and file management are in big endian (Sun® or Motorola®) byte order.
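
To make the mixed layout concrete, here is a minimal sketch of reading the 100-byte main file header in Python with the standard `struct` module. The function name `parse_shp_header` and the dictionary keys are hypothetical, chosen only for illustration. The field offsets are as I understand them from the ESRI technical description, so check them against the spec before relying on this.

```python
import struct


def parse_shp_header(header: bytes) -> dict:
    """Decode the fixed 100-byte header of a .shp main file.

    Hypothetical helper for illustration; offsets assumed from the
    ESRI Shapefile Technical Description.
    """
    if len(header) < 100:
        raise ValueError("a shapefile header is 100 bytes long")

    # File management fields are big endian (">").
    (file_code,) = struct.unpack_from(">i", header, 0)
    (file_length_words,) = struct.unpack_from(">i", header, 24)

    # Data description fields are little endian ("<").
    version, shape_type = struct.unpack_from("<2i", header, 28)
    bounds = struct.unpack_from("<8d", header, 36)

    if file_code != 9994:
        raise ValueError(f"unexpected file code {file_code}")

    return {
        "file_code": file_code,
        # The length is stored in 16-bit words, so double it for bytes.
        "file_length_bytes": file_length_words * 2,
        "version": version,
        "shape_type": shape_type,
        # Xmin, Ymin, Xmax, Ymax (the Z and M ranges follow in bounds[4:]).
        "bbox": bounds[:4],
    }
```

Spelling out the byte order in every format string (`>` or `<`) keeps the parser independent of the host machine's native endianness, which matters here because no single default fits the whole file.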

Tomasz Nurkiewicz
canisrufus
  • Git seems to do the same thing with pack files, as indicated in [this](https://codewords.recurse.com/issues/three/unpacking-git-packfiles) page – jrtapsell Feb 22 '18 at 22:49

1 Answer


While there doesn't seem to be a clear answer, the explanations I've seen are a mixture of "confusion while trying to create a format that works on all platforms" and "a lot of poorly designed formats were designed back then". More info here: https://gis.stackexchange.com/questions/18969/oddities-in-the-shapefile-technical-specification

Community
avlund
  • Haha. Thanks for the time delayed answer. Note the user who asked the question over there ;) – canisrufus May 23 '14 at 15:19
  • Haha, didn't notice that at all. In that case, I don't feel so bad about this thread not getting an answer until now. :P – avlund May 24 '14 at 21:36