2

I've implemented a simple autocorrelation routine that runs on audio sampled at 44,100 Hz, in blocks of 2048 samples.

The general formula I am following looks like this:

r[k] = a[k] * b[k] = ∑ a[n] • b[n + k]

and I've implemented it in a brute-force nested loop as follows:

for k = 0 to N-1 do
    r[k] := 0
    for n = 0 to N-1 do
        if (n+k) < N then
            r[k] := r[k] + a[n] * a[n+k]
        else
            break
        end if
    end for n
end for k

I then look for the largest peak in r, take its lag in samples, and convert that lag to a frequency.
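For reference, the loop and the peak-picking step described above can be sketched in Python as follows. The function names are made up for illustration. How the lag-0 peak is skipped is an assumption (the question does not say): this sketch waits for the first negative value of r and searches from there.

```python
def autocorrelate(samples):
    """Brute-force autocorrelation: r[k] = sum over n of a[n] * a[n + k]."""
    n = len(samples)
    return [sum(samples[i] * samples[i + k] for i in range(n - k))
            for k in range(n)]


def peak_lag(r):
    """Lag of the largest value of r after the lobe around lag 0.

    Assumption: the lag-0 lobe ends at the first negative value of r.
    Returns None if r never goes negative.
    """
    start = next((k for k, value in enumerate(r) if value < 0.0), None)
    if start is None:
        return None
    return max(range(start, len(r)), key=lambda k: r[k])


def lag_to_frequency(lag, sample_rate=44100.0):
    """A period of `lag` samples corresponds to sample_rate / lag Hz."""
    return sample_rate / lag
```

With a block of 2048 samples this performs roughly two million multiply-adds per block, which is where the slowness comes from.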

To help temper the tuner's results, I am using a circular buffer and returning the median each time.
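A minimal sketch of that smoothing step, assuming a fixed-size window of recent estimates. The class name and the default window size are hypothetical, not taken from the actual tuner.

```python
from collections import deque
from statistics import median


class MedianSmoother:
    """Keeps the last `size` frequency estimates and reports their median."""

    def __init__(self, size=5):
        # A deque with maxlen behaves like a circular buffer:
        # appending to a full deque drops the oldest entry.
        self._window = deque(maxlen=size)

    def update(self, frequency_hz):
        """Add a new estimate and return the median of the current window."""
        self._window.append(frequency_hz)
        return median(self._window)
```

A single octave error (for example one 880 Hz reading among 440 Hz readings) is discarded by the median as long as it occupies less than half of the window.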

The brute force calculations are a bit slow - is there a well-known, faster way to do them?

Sometimes, the tuner just isn't quite as accurate as is needed. What type of heuristics can I apply here to help refine the results?

Sometimes the OCTAVE is incorrect - is there a way to home in on the correct octave more reliably?

TylerH
Luther Baker

3 Answers

4

The efficient way to do autocorrelation is with an FFT:

  • FFT the time domain signal
  • replace each complex FFT bin with its squared magnitude and zero phase (i.e. the power spectrum)
  • take inverse FFT

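This works because the autocorrelation of a signal and its power spectrum are a Fourier transform pair (the Wiener-Khinchin theorem). Note that the FFT gives a circular autocorrelation; if you want the linear (non-wrapping) sum from the question, zero-pad the block to at least twice its length before transforming.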
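The three steps can be sketched in plain Python as below. This is an illustration under stated assumptions, not a tuned implementation: the small recursive radix-2 `fft` helper is written here only to keep the example self-contained (a real tuner would call an optimized FFT library), and the block is zero-padded to at least twice its length so the result matches the linear sum in the question rather than a circular one.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def autocorrelate_fft(samples):
    """Autocorrelation for lags 0..N-1 via FFT -> power spectrum -> inverse FFT."""
    n = len(samples)
    size = 1
    while size < 2 * n:          # pad to a power of two that is >= 2N
        size *= 2
    padded = [complex(s) for s in samples] + [0j] * (size - n)
    spectrum = fft(padded)
    power = [abs(c) ** 2 for c in spectrum]   # squared magnitude, zero phase
    r = fft(power, inverse=True)
    # The helper is unnormalized, so divide by the transform size here.
    return [value.real / size for value in r[:n]]
```

For a block of N samples this costs on the order of N log N operations instead of N², and gives the same r[k] as the nested loop up to floating-point rounding.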

Having said that, bare-bones autocorrelation is not a great way to implement accurate pitch detection in general, so you might want to rethink your whole approach.

Paul R
0

I don't fully understand the question, but I can point out one trick that you might be able to use. You say you look for the sample that is the max magnitude. If it is useful in the rest of your calculations, you can calculate that sample number to sub-sample precision.

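Let's say you have a peak of 0.9 at sample 5 and neighboring samples of 0.1 and 0.8. The actual peak is probably somewhere between sample 5 and sample 6. A weighted average of the three sample positions, using the sample values as weights, gives an estimate: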

(0.1 * 4 + 0.9 * 5 + 0.8 * 6) / (0.1 + 0.9 + 0.8) = 5.39
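The same weighted average as a small Python function. The name `centroid_peak` is made up for this sketch, and it assumes the peak is not at either end of the array and that the three values are positive. It can be applied to the autocorrelation array around the peak lag just as well as to raw samples.

```python
def centroid_peak(values, peak_index):
    """Value-weighted average of the peak index and its two neighbours.

    Assumes 1 <= peak_index <= len(values) - 2 and positive values there.
    """
    indices = (peak_index - 1, peak_index, peak_index + 1)
    weights = [values[i] for i in indices]
    return sum(i * w for i, w in zip(indices, weights)) / sum(weights)
```

Called on the example above (0.1, 0.9, 0.8 at positions 4, 5, 6), it returns about 5.39. A fractional lag like this turns directly into a finer frequency estimate, since frequency is the sample rate divided by the lag.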
Fantius
  • I'm not sure I can use that in my autocorrelation algorithm. Autocorrelation isn't looking for the max magnitude of one sample. – Luther Baker Sep 20 '11 at 22:13
0

One simple way to improve this "brute force" autocorrelation method is to limit the range of k and only search for lags (or pitch periods) near the previous average period, say within ±0.5 semitones at first. If you don't find a correlation, then search a slightly wider range, say, within a major third, and after that search a wider range still, but within the expected frequency range of the instrument being tuned.
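A rough sketch of that widening search follows. Everything named here is hypothetical, and one point is an outright assumption: the answer does not define what counts as "finding a correlation", so this sketch accepts a lag only if its correlation is at least `threshold` times the lag-0 energy. The value 0.8 is arbitrary and would need tuning.

```python
import math


def correlation_at(samples, lag):
    """Autocorrelation sum for one lag (same sum as the brute-force loop)."""
    return sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))


def best_lag_in_range(samples, lo, hi, threshold):
    """Strongest lag in [lo, hi], or None if it is too weak to trust."""
    lo = max(lo, 1)
    hi = min(hi, len(samples) - 1)
    energy = correlation_at(samples, 0)
    if energy <= 0.0 or lo > hi:
        return None
    best = max(range(lo, hi + 1), key=lambda k: correlation_at(samples, k))
    # Assumption: "valid" means at least `threshold` of the lag-0 energy.
    if correlation_at(samples, best) / energy >= threshold:
        return best
    return None


def track_lag(samples, previous_lag, min_lag, max_lag, threshold=0.8):
    """Search near previous_lag first, then widen: 0.5 semitones,
    a major third (4 semitones), then the instrument's whole lag range."""
    for semitones in (0.5, 4.0, None):
        if semitones is None:
            lo, hi = min_lag, max_lag
        else:
            ratio = 2.0 ** (semitones / 12.0)
            lo = max(min_lag, math.floor(previous_lag / ratio))
            hi = min(max_lag, math.ceil(previous_lag * ratio))
        lag = best_lag_in_range(samples, lo, hi, threshold)
        if lag is not None:
            return lag
    return None
```

When the pitch is stable, only a handful of lags are evaluated per block, so the cost drops from N² to a small multiple of N.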

You can get higher frequency resolution by using a higher sample rate (say, upsampling the data before the autocorrelation if necessary, and with proper filtering).

You will get autocorrelation peaks for the pitch lag (period) and for multiples of that lag. You will have to eliminate those subharmonics somehow (maybe as impossible for the instrument, or perhaps as an unlikely pitch jump from the previous frequency estimations.)
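One possible way to code the two filters mentioned here (instrument range and unlikely jumps), given a list of candidate peak lags. The function name, the 7-semitone jump limit, and the fallback of taking the shortest plausible lag when there is no previous estimate are all assumptions made for this sketch.

```python
import math


def pick_lag(candidate_lags, sample_rate, min_hz, max_hz,
             previous_hz=None, max_jump_semitones=7.0):
    """Choose one lag from several autocorrelation peak lags.

    Drops lags whose frequency is outside [min_hz, max_hz], then prefers
    the candidate closest (in semitones) to the previous estimate.
    Returns None if nothing plausible remains.
    """
    plausible = [lag for lag in candidate_lags
                 if min_hz <= sample_rate / lag <= max_hz]
    if not plausible:
        return None
    if previous_hz is None:
        # Assumption: extra peaks sit at multiples of the true period,
        # so the shortest plausible lag is taken as the fundamental.
        return min(plausible)

    def jump(lag):
        return abs(12.0 * math.log2((sample_rate / lag) / previous_hz))

    best = min(plausible, key=jump)
    return best if jump(best) <= max_jump_semitones else None
```

For example, with peaks at lags 100, 200 and 300 at 44,100 Hz and a previous estimate near 440 Hz, the lag of 100 samples (441 Hz) is kept and the two longer lags are treated as subharmonics.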

hotpaw2
  • Regarding the phrase "find a correlation", I'm generally looking for the "max" - and that is it, I don't actually compare or test whatever the max summation is. Using the modification you suggest, would I return the resulting "max" > 0 ... or do I need some logic to know that I went from a bad to a valid correlation. – Luther Baker Sep 22 '11 at 22:55
  • For instance, would I want to make sure I get a summation < 0 - and then look for the max up to the next value < 0 so I know that? Is that what would help me decide if I need to try against a wider range? I guess, fundamentally, what makes a correlation valid or not - if I'm simply looking for the max value from 'k' to 'n' where k is my 5 semitones less than my previous freq? – Luther Baker Sep 22 '11 at 22:56