I have a 16-bit, 48 kHz, 1-channel (mono) PCM audio file (with no header, but it would be the same with a WAV header anyway). I can read that file correctly using software such as Audacity. However, when I try to read it programmatically (in C++), some samples seem to be out of place, while most are correct when compared with the values Audacity shows.
My process of reading the PCM file is the following:
- Convert the PCM byte array to a short array by bit-shifting to get readable values (the byte order is little-endian here).
for(int i = 0; i < bytesSize - 1; i += 2)
    shortValue[i] = bytes[i] | bytes[i + 1] << 8;
Note: bytes is a char array of the binary contents of the PCM file, and shortValue is a short array.
- Convert the short values to amplitude levels in a float array by dividing by the maximum value of short (32767).
for(int i = 0; i < shortsSize; i++)
    amplitude[i] = static_cast<float>(shortValue[i]) / 32767;
This is obviously not optimal code, and I could do it in one loop, but I separated the two steps purely for the sake of explanation.
What happens is that when I look for very large changes in amplitude in my final array, it flags samples whose values are not correct. For example, in Audacity the wave is perfectly smooth: sample 276,467 (pointed to in green) is followed by a slightly lower sample (pointed to in red), which should be around -0.17.
However, when reading from my code, I get a totally wrong value for the red sample (-0.002), while still getting a good value for the green sample (around -0.17). The sample after the red one is also correct (around -0.17 as well).
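The search for large amplitude changes amounts to comparing each sample with its predecessor. A minimal sketch of that check, where the function name findLargeJumps and the threshold parameter are hypothetical and not taken from the original code:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns every index i where the amplitude changes by
// more than `threshold` between sample i - 1 and sample i.
std::vector<std::size_t> findLargeJumps(const std::vector<float>& amplitude,
                                        float threshold)
{
    std::vector<std::size_t> jumps;
    for (std::size_t i = 1; i < amplitude.size(); ++i)
    {
        if (std::fabs(amplitude[i] - amplitude[i - 1]) > threshold)
            jumps.push_back(i);
    }
    return jumps;
}
```

With the three values described above (-0.17, -0.002, -0.17) and a threshold of 0.1, this reports both the wrong sample and the one after it, since each differs from its predecessor by about 0.17.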
I don't really understand what's happening or how Audacity is able to read those bytes correctly. I tried with multiple PCM/WAV files and I get the same results. Any help would really be appreciated!