13

I'm currently trying to decipher WAV files, from the headers to the PCM data.

I've found a PDF (http://www.tdt.com/T2Support/technical_notes/tn0132.pdf) detailing the anatomy of a WAV file, and I've been able to extract and make sense of the appropriate header data using Ghex2. But my questions are:

Why are the integers' bytes stored backwards? E.g., decimal 20 is stored as 0x14000000 instead of 0x00000014.

Are the integers of the PCM data also stored backwards?

Z3t

2 Answers

14

WAV files are little-endian (least significant byte first) because the format originated on operating systems running on Intel-based machines, which store numbers in little-endian format.
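As a rough illustration, the header fields can be read with Python's `struct` module using the `<` (little-endian) prefix. This is only a sketch: it assumes the simple canonical layout in which the `fmt ` chunk immediately follows `WAVE`, which real files do not always follow, and the name `parse_header` and the in-memory example header are made up for the demonstration.

```python
import struct

# "<" = little-endian, no padding. Field order of the canonical 36-byte
# RIFF/WAVE/fmt header (assumes "fmt " directly follows "WAVE").
HEADER = struct.Struct("<4sI4s4sIHHIIHH")

def parse_header(data):
    """Hypothetical helper: unpack the first 36 bytes of a canonical WAV file."""
    (riff, riff_size, wave, fmt_id, fmt_size, audio_format, channels,
     sample_rate, byte_rate, block_align, bits_per_sample) = HEADER.unpack_from(data, 0)
    # The four-character tags are plain byte sequences, not integers,
    # so byte order does not apply to them.
    if riff != b"RIFF" or wave != b"WAVE" or fmt_id != b"fmt ":
        raise ValueError("not a canonical WAV header")
    return {
        "fmt_size": fmt_size,
        "audio_format": audio_format,
        "channels": channels,
        "sample_rate": sample_rate,
        "byte_rate": byte_rate,
        "block_align": block_align,
        "bits_per_sample": bits_per_sample,
    }

# Build an example header in memory (16-bit stereo, 44.1 kHz, no sample data).
example = HEADER.pack(b"RIFF", 36, b"WAVE", b"fmt ", 16, 1, 2, 44100, 176400, 4, 16)
info = parse_header(example)
print(info["sample_rate"], example[24:28].hex())
```

In a hex editor the sample rate 44100 (0xAC44) shows up as the bytes `44 ac 00 00`, the same "backwards" pattern the question describes.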

If you think about it, it kind of makes sense: if you want to cast a long integer to a short one, or even to a character, the starting address remains the same; you just look at fewer bytes.
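Python has no pointer casts, but the same idea can be sketched with `struct.unpack_from`, reading progressively narrower integers from the same starting offset. The variable names here are illustrative only.

```python
import struct

# Decimal 20 as a 32-bit little-endian unsigned integer.
raw = struct.pack("<I", 20)
print(raw.hex())  # 14000000

# Reading 4, 2, or 1 byte(s) from offset 0 gives the same value,
# because the least significant byte comes first.
as_u32 = struct.unpack_from("<I", raw, 0)[0]
as_u16 = struct.unpack_from("<H", raw, 0)[0]
as_u8 = struct.unpack_from("<B", raw, 0)[0]

# With big-endian storage the narrower read at offset 0 sees the high bytes.
big = struct.pack(">I", 20)
big_u16 = struct.unpack_from(">H", big, 0)[0]

print(as_u32, as_u16, as_u8, big_u16)
```

The narrowing only preserves the value when it fits in the smaller type, as 20 does here.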

Consequently, for encodings of 16 bits and up, the PCM data is stored little-endian as well. This is quite handy, since you will be able to pull the samples in as integers. Don't forget that 16-bit samples are stored as two's complement signed integers, whereas 8-bit samples are unsigned. (see http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html for more detail)
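A minimal sketch of decoding raw sample bytes under those rules might look like the following. The function names are hypothetical, the input is assumed to be mono sample data already extracted from the `data` chunk, and subtracting 128 from 8-bit samples to center them on zero is just one common convention.

```python
import struct

def decode_pcm16(data):
    """Decode 16-bit little-endian two's complement samples."""
    count = len(data) // 2
    return list(struct.unpack("<%dh" % count, data[:count * 2]))

def decode_pcm8(data):
    """Decode 8-bit unsigned samples, shifted so that silence (128) becomes 0."""
    return [b - 128 for b in data]

samples16 = decode_pcm16(bytes([0x00, 0x00, 0xFF, 0x7F, 0x00, 0x80, 0xFF, 0xFF]))
samples8 = decode_pcm8(bytes([0x00, 0x80, 0xFF]))
print(samples16)  # [0, 32767, -32768, -1]
print(samples8)   # [-128, 0, 127]
```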

Scott Stensland
FixerMark
  • It's weird that the mentioned page states that number should be stored in big-endian form (but the format actually does use little-endian). – vgru Oct 31 '13 at 12:17
  • The mentioned page no longer states anything as it has vanished... (I've found that happens rather often with college web site links, after a few years.) – Peter Hansen Feb 09 '15 at 20:22
  • I'm not sure if my file doesn't comply or what, but all of the number fields (sample rate, bit rate, etc) are stored in little endian, while all the word fields (RIFF, WAVE, fmt , etc) are stored in big endian. – MarcusJ Jun 25 '15 at 14:08
3

"Backwards" is subjective. Some machines are big-endian, others are little-endian. In byte-oriented contexts like file formats and network protocols, the order is arbitrary. Some formats like to specify big- or little-endian, others like to be flexible and accept either form, with a flag indicating which is in use.

Looks like WAV files just like little-endian.
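To make the point concrete, here is a small sketch that interprets the same four bytes both ways with `int.from_bytes`. The `read_u32` helper and its `big_endian` flag are hypothetical, standing in for a format that records its byte order in a flag.

```python
def read_u32(raw, big_endian=False):
    """Hypothetical reader: interpret 4 bytes according to a byte-order flag."""
    return int.from_bytes(raw[:4], byteorder="big" if big_endian else "little")

raw = bytes([0x14, 0x00, 0x00, 0x00])
little = read_u32(raw)                 # WAV-style interpretation
big = read_u32(raw, big_endian=True)   # same bytes read most significant first
print(little, big)  # 20 335544320
```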

Ned Batchelder