I'm implementing the pitch shifting method described in Nicolas Juillerat & Beat Hirsbrunner's 2010 paper "Low Latency Audio Pitch Shifting in the Frequency Domain". I've got most of the algorithm implemented so far (here's the code if you're curious, but it shouldn't matter for this question).

I'm stuck on the last step of Section 3.5: Handling the Modulation Effect. Applying a von Hann window to the analysis and synthesis stages was simple enough, but it seems like the paper is missing some details on how to calculate this curve which I'm supposed to divide by:

Second, the cycle of altered analysis windows is computed for the current scaling ratio, overlap factor, analysis window and synthesis window; and the resulting amplitude modulation curve is calculated. After the inverse DFT and overlap-add process, the resulting time-domain samples are divided by the computed amplitude modulation curve, in order to “demodulate” the result.

The paper provides some example images, but I'm unable to figure out how these curves should be calculated. (This operation should fit in around line 119 of the gist I linked above.) The algorithm currently sounds worse than the standard phase vocoder approach at low latencies, so it looks like this demodulation step is crucial to the quality of the algorithm.

I don't know the math behind where this amplitude modulation comes from, so I'm not really sure where I would even start to figure out how to calculate the curve. I could put some sine waves through the algorithm and see what comes out, but that information is basically already provided by the images and doesn't help me figure out an actual formula.
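As a starting point, one plain reading of the quoted passage is that the curve is the overlap-add sum of (analysis window × synthesis window) over all frames covering each output sample. That reading is an assumption, not something the paper confirms. The sketch below only covers the overlap-add and the division; how each "altered" analysis window follows from the scaling ratio is exactly the missing piece, so it is left as an input. The names `hann`, `modulation_curve`, `demodulate` and the `floor` guard are hypothetical and do not come from the paper or the gist.

```python
import math


def hann(n):
    """Periodic von Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def modulation_curve(analysis_cycle, synthesis, hop):
    """One steady-state period of the overlap-added window products.

    analysis_cycle: list of analysis windows (ASSUMPTION: the "cycle of
    altered analysis windows"; frame j uses analysis_cycle[j % cycle]).
    synthesis: synthesis window, same length as each analysis window.
    hop: hop size in samples (window length / overlap factor).
    Returns hop * len(analysis_cycle) samples.
    """
    n = len(synthesis)
    cycle = len(analysis_cycle)
    period = hop * cycle
    frames_back = -(-n // hop)  # ceil(n / hop): earlier frames that can still overlap
    curve = [0.0] * period
    # Frame j starts at sample j * hop; include earlier frames that still
    # overlap the interval [0, period).
    for j in range(-frames_back, cycle):
        analysis = analysis_cycle[j % cycle]
        for i in range(n):
            t = j * hop + i
            if 0 <= t < period:
                curve[t] += analysis[i] * synthesis[i]
    return curve


def demodulate(samples, curve, floor=1e-3):
    """Divide output samples by the periodic curve.

    ASSUMPTION: samples[0] is aligned with the start of frame 0 of the
    cycle. `floor` is a hypothetical guard against dividing by ~0.
    """
    period = len(curve)
    return [s / max(curve[t % period], floor) for t, s in enumerate(samples)]


# Unaltered baseline: Hann analysis and synthesis at 75% overlap.
baseline = modulation_curve([hann(8)], hann(8), hop=2)
```

With unaltered Hann windows at 75% overlap the curve is flat, so dividing by it is only a gain change. Any audible modulation would have to come from the altered windows passed in as `analysis_cycle`.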

So, does anyone know how I might calculate the amplitude modulation curve for this algorithm?

jconst