
How is audio data downsampled to 5512 Hz PCM? I have read some articles, and the steps involved are decoding the audio to PCM, converting it to mono, and then downsampling it.

For converting to mono, are the channels of each frame averaged to get the mono signal?

Once the mono signal is obtained, how is it downsampled?

If the downsampled sample rate equals the original sample rate divided by an integer factor, how is that factor applied to the samples of the mono signal?

some_id
  • It is not necessary (or desirable) to convert to mono if all you want to do is perform sample-rate conversion. If you WANT to convert to mono, that is a separate, unrelated step. – Bjorn Roche Aug 03 '13 at 01:45
  • A technique for downsampling (and converting to mono) has been discussed here: http://stackoverflow.com/questions/15087668/how-to-convert-pcm-samples-in-byte-array-as-floating-point-numbers-in-the-range/15094612#15094612 – Bjorn Roche Aug 03 '13 at 01:48
  • I need to convert to mono; I am trying to write and experiment with an audio fingerprinter. – some_id Aug 13 '13 at 14:01

1 Answer

Downsampling can be done in two steps: low-pass filtering and interpolation. If you don't want audible artifacts, the low-pass filter has to be of very high quality: it must remove the content above the new Fs/2 that would otherwise alias, without distorting the remaining passband below it. The low-pass filter and the interpolator can be combined into a single step by using an FIR filter with a polyphase or continuous kernel that is the same as, or similar to, a windowed sinc function.
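As a rough illustration of the combined filter-and-interpolate approach, here is a minimal pure-Python sketch that evaluates a continuous Hann-windowed sinc kernel at each output position. The function name `resample_windowed_sinc`, the `half_width` parameter, and the choice of a Hann window are assumptions made for this example; a production resampler would likely use a better window and precomputed polyphase tables.

```python
import math

def resample_windowed_sinc(samples, src_rate, dst_rate, half_width=32):
    """Resample a mono signal by evaluating a Hann-windowed sinc kernel
    at each output sample's (possibly fractional) position in the input.

    half_width is the number of sinc zero crossings kept on each side
    (an illustrative default, not a tuned value).
    """
    ratio = dst_rate / src_rate        # < 1 when downsampling
    cutoff = min(1.0, ratio)           # cutoff as a fraction of the source Nyquist
    support = half_width / cutoff      # kernel half-length, in input samples
    n_out = int(len(samples) * ratio)
    out = []
    for m in range(n_out):
        center = m / ratio             # output sample m on the input time axis
        lo = max(0, math.ceil(center - support))
        hi = min(len(samples) - 1, math.floor(center + support))
        acc = 0.0
        for n in range(lo, hi + 1):
            t = n - center
            x = cutoff * t
            sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
            window = 0.5 * (1.0 + math.cos(math.pi * t / support))
            # cutoff * sinc(cutoff * t) is the ideal low-pass response;
            # the window truncates it smoothly to a finite length.
            acc += samples[n] * cutoff * sinc * window
        out.append(acc)
    return out
```

Because the kernel is evaluated at arbitrary fractional offsets, the same function handles non-integer ratios such as 44100 to 5512. Output samples near the start and end of the signal use a truncated kernel and are less accurate.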

When downsampling 44100 Hz by exactly 8x (which gives 5512.5 Hz rather than 5512 Hz), the interpolation step becomes trivial: apply a very high-quality low-pass filter, then decimate by keeping only every 8th sample.
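The integer-factor case can be sketched as follows, assuming decoded PCM frames are available as tuples of per-channel floats. The helper names (`to_mono`, `lowpass_taps`, `decimate`), the tap count, and the Hann window are hypothetical choices for illustration, and mono conversion is shown as a plain channel average, which is one common convention.

```python
import math

def to_mono(frames):
    """Average the channels of each frame, e.g. [(L, R), ...] -> [m, ...]."""
    return [sum(frame) / len(frame) for frame in frames]

def lowpass_taps(factor, taps_per_phase=16):
    """Hann-windowed sinc low-pass with its cutoff at the post-decimation Nyquist."""
    n_taps = 2 * factor * taps_per_phase + 1   # odd length, symmetric about the centre tap
    mid = n_taps // 2
    taps = []
    for i in range(n_taps):
        t = i - mid
        x = t / factor
        sinc = 1.0 if t == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.5 * (1.0 + math.cos(math.pi * t / (mid + 1)))
        taps.append(sinc * window / factor)
    gain = sum(taps)
    return [c / gain for c in taps]            # normalise to unity gain at DC

def decimate(mono, factor=8):
    """Low-pass filter, then keep every `factor`-th sample."""
    taps = lowpass_taps(factor)
    mid = len(taps) // 2
    out = []
    # Compute the filter output only at the samples that are kept.
    for center in range(0, len(mono), factor):
        acc = 0.0
        for k, c in enumerate(taps):
            n = center + k - mid
            if 0 <= n < len(mono):             # treat samples outside the signal as zero
                acc += c * mono[n]
        out.append(acc)
    return out
```

For example, `decimate(to_mono(frames), 8)` on 44100 Hz stereo frames would yield a mono signal at 5512.5 Hz. Evaluating the filter only at the retained positions avoids computing the seven out of every eight outputs that would be discarded.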

hotpaw2