Sample Rate Conversion at Runtime

Question

So, I had this brilliant idea to write a parser for .MOD files (tracker music) and to try playing them back, as a C# practice exercise for myself.

I got as far as actually playing a file back, but ran into an issue. The audio buffer that my engine uses runs at a semi-fixed rate, 44100 Hz I believe. This particular module file, like all .MOD files I think, has a sample rate of 8287 Hz (something from the Atari/Amiga age, I believe?).

So long story short, it isn't sounding so great.

Then I read up a bit on sample rate conversion, and some page made a simple comparison: if you have nothing to spare, use nearest neighbor. So I did. It sounds like a dying horse. It screeches horribly, would not recommend. So I attempted the next best thing, linear interpolation. I know when I have to play the next sample from my original 8.3 kHz file, and I interpolate all the output samples I have to fill until then from the last sample in the original file, or from 0 otherwise, I think (which is probably bad, but a bit beside the point).

So now I have that, and it vaguely resembles the real sound (grain of salt), but it still sounds horrible. Have I done something wrong, or does SRC just take a lot more work to get something decent out of it? Also, I cannot hear any real difference between nearest neighbor and linear interpolation. Is there a reason for that?
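For reference, linear interpolation as described above could be sketched in C# roughly like this. It is a minimal sketch, assuming the sample data has already been decoded into a float array; the class and method names are illustrative and are not taken from the code discussed in this question. Each output sample is blended from the two source samples on either side of a fractional read position, and the last sample is held at the end rather than blended toward 0.

```csharp
using System;

public static class LinearResampler
{
    // Resamples `source` from sourceRate to targetRate using linear interpolation.
    // Hypothetical helper for illustration only.
    public static float[] Resample(float[] source, double sourceRate, double targetRate)
    {
        if (source.Length == 0) return new float[0];

        // How far the read position advances in the source per output sample.
        double step = sourceRate / targetRate;
        int outputLength = (int)(source.Length / step);
        var output = new float[outputLength];

        for (int i = 0; i < outputLength; i++)
        {
            double position = i * step;          // fractional read position in the source
            int index = (int)position;           // source sample just before the position
            double fraction = position - index;  // 0 = at `index`, close to 1 = at `index + 1`

            float current = source[index];
            // At the very end there is no next sample, so hold the last one
            // instead of interpolating toward 0.
            float next = index + 1 < source.Length ? source[index + 1] : current;

            output[i] = (float)(current + (next - current) * fraction);
        }
        return output;
    }
}
```

Calling `LinearResampler.Resample(samples, 8287, 44100)` would stretch the 8287 Hz data to the 44100 Hz buffer rate. Linear interpolation on its own only weakly suppresses the spectral images created by upsampling, so some harshness is expected even when it is implemented correctly.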

Original: http://puu.sh/tkgWZ.wav

Nearest: http://puu.sh/tkh0L.ogg

Linear Interpolation: http://puu.sh/tkgZ6.ogg

Is it realistic to try this at runtime, instead of pre-processing every sample and then using it?

I have seen advanced libraries covering just this. I don't need my project to sound anything like those, but is there a decent alternative, or am I simply doing it wrong here? I would really appreciate some audio-side insight, as I am not that familiar with how audio works exactly.

Any advice is welcome!

Sample Rate Conversion aka "resampling" isn't super-complicated, but you do need a pretty good anti-aliasing (lowpass) filter. A two-pole Butterworth should work if you want to do it with low CPU usage (comparable to the linear interpolation). — Ben Voigt, Jan 12 '17 at 23:11
Also, you want to upsample to the least common multiple of your input and output rates, filter, then decimate. By "upsample" I just mean inserting zeros in between, and decimate is just throwing away some values. If you don't want to mess with a higher frequency for processing, then pretend that your source material is exactly 1/5th the rate of your 44.1k sound system, that'll produce a small frequency shift but no loss of quality. — Ben Voigt, Jan 12 '17 at 23:14
@BenVoigt Do you have any reference or tutorial on this? I would love to also learn the reasoning behind it: what each of these steps achieves, why some approaches are better than others, and so on. Most of the material I found was either really in-depth, high-end stuff, or "You can use this to convert X to Y!" (which I do not want; I want to learn how to do it myself). Many thanks for the pointers, I will try to figure it out from here and see what I get. Audio output is just a tad hard to analyze. — Smileynator, Jan 13 '17 at 18:42
I suggest looking at a plot of the signal; it will teach you more about what's going on than listening to it. This is a good explanation, but it comes without any pictures: http://dspguru.com/dsp/faqs/multirate/resampling — Ben Voigt, Jan 13 '17 at 18:52
It does seem to cover a whole lot. It explains what you are trying to tell me, so I think I can work this out. Thanks a lot for the directions! — Smileynator, Jan 13 '17 at 19:03
@BenVoigt I have tried an implementation, but it currently results in just a bunch of "ticking" sounds in my output, so I am probably doing some of the steps horribly wrong. Would you mind taking a look? http://pastebin.com/DqMLnkzD As a note, the LCM is tested and works properly, and I used the ButterworthLowpass example given here: https://www.codeproject.com/Tips/1092012/A-Butterworth-Filter-in-Csharp I just made some small edits so the 2nd parameter takes the sampling rate, as it is known to me. Mind you, I have not optimized jack. Proof of concept first. — Smileynator, Jan 14 '17 at 10:17
Cutoff frequency should be at or below half of both input and output rate. Since your input rate is lower, use half of that. — Ben Voigt, Jan 15 '17 at 04:21
Also, 8287 is prime, which means LCM will upsample to 44100 * 8287. And the filter cutoff frequency will be 88000 times lower than the filter sample frequency... not good at all. You'll need to use an approximation of the LCM instead, in order to keep the upsampling and downsampling factors reasonable (neither one should be above about 20). — Ben Voigt, Jan 15 '17 at 04:25
In your case, upsample by 16 and downsample by 3 is a very good approximation. — Ben Voigt, Jan 15 '17 at 04:29
@BenVoigt Ah, the cutoff frequency had me confused. The way they put it, I was not sure what they meant by it. As for the up/down sampling: purely looking at the logic, it would not work the way I do it. Or is the maximum up/down sampling factor of 20 just there to prevent major memory/processing issues? I am also not sure that I grasp the prime issue, but I suppose I can figure out an algorithm to do the math and keep it under the max ratios. Thanks for teaching me! — Smileynator, Jan 15 '17 at 10:00
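The steps suggested in the comments (pick small up/down factors, insert zeros, low-pass, then decimate) could be sketched in C# roughly as follows. This is only a sketch under stated assumptions: the class and method names are hypothetical, the input is assumed to be a mono float array, and the filter is a single second-order Butterworth low-pass written as a standard bilinear-transform biquad, not the CodeProject class linked above. A two-pole filter rolls off gently, so some imaging will remain; a steeper filter would suppress it better.

```csharp
using System;

public static class RationalResampler
{
    // Brute-force search for the up/down factors (each <= maxFactor) whose
    // ratio is closest to targetRate / sourceRate.
    public static (int Up, int Down) FindFactors(double sourceRate, double targetRate, int maxFactor = 20)
    {
        double wanted = targetRate / sourceRate;
        int bestUp = 1, bestDown = 1;
        double bestError = double.MaxValue;
        for (int up = 1; up <= maxFactor; up++)
        {
            for (int down = 1; down <= maxFactor; down++)
            {
                double error = Math.Abs((double)up / down - wanted);
                if (error < bestError) { bestError = error; bestUp = up; bestDown = down; }
            }
        }
        return (bestUp, bestDown);
    }

    // Upsample by `up` (zero insertion), low-pass, then keep every `down`-th sample.
    public static float[] Resample(float[] source, double sourceRate, int up, int down)
    {
        double workRate = sourceRate * up;  // rate at which the filter runs
        // Cutoff at half of the lower of the input and output rates.
        double cutoff = 0.5 * Math.Min(sourceRate, workRate / down);

        // Second-order Butterworth low-pass (Q = 1/sqrt(2)) as a biquad.
        double w0 = 2.0 * Math.PI * cutoff / workRate;
        double cosW0 = Math.Cos(w0);
        double alpha = Math.Sin(w0) / Math.Sqrt(2.0);  // sin(w0) / (2 * Q)
        double a0 = 1.0 + alpha;
        double b0 = (1.0 - cosW0) / 2.0 / a0;
        double b1 = (1.0 - cosW0) / a0;
        double b2 = b0;
        double a1 = -2.0 * cosW0 / a0;
        double a2 = (1.0 - alpha) / a0;

        int workLength = source.Length * up;
        var output = new float[(workLength + down - 1) / down];
        double x1 = 0, x2 = 0, y1 = 0, y2 = 0;  // filter state

        for (int n = 0; n < workLength; n++)
        {
            // Zero insertion: a real sample every `up` steps, zeros in between.
            // Scaling by `up` restores the level lost to the inserted zeros.
            double x = n % up == 0 ? source[n / up] * up : 0.0;

            double y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2;
            x2 = x1; x1 = x;
            y2 = y1; y1 = y;

            // Decimation: keep only every `down`-th filtered sample.
            if (n % down == 0) output[n / down] = (float)y;
        }
        return output;
    }
}
```

For 8287 Hz to 44100 Hz, `FindFactors(8287, 44100)` yields 16 and 3, matching the approximation suggested above. 8287 * 16 / 3 is about 44197 Hz, so playing the result at 44100 Hz shifts the pitch by roughly 0.2%. This version materializes no intermediate buffer but still runs the filter on every zero; a polyphase implementation would avoid that work.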
