OK, there are a bunch of questions about this here, and plenty of reading material on Google, yet I still can't figure it out. I want to get the fundamental frequency of a segment of speech. The basic steps are supposed to be:
- take the FFT of a windowed signal
- convert the FFT from rectangular to polar coordinates (so you can get the magnitude)
- discard the phase information
- square the magnitude of each bin, then take its natural log
- take another FFT (or, according to some sources, the inverse FFT?)
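The steps above (usually called the power cepstrum) can be sketched as follows. This is a minimal sketch in Python rather than AS3, not a definitive implementation. The names `dft`, `real_cepstrum`, `estimate_f0`, `f_min` and `f_max` are hypothetical, the naive O(N^2) transform only stands in for a real FFT, and the 60 to 400 Hz search range and the small log floor are assumptions. Because the log power spectrum of a real signal is real and symmetric, its forward and inverse transforms differ only by a factor of 1/N, and the result is itself mirrored around N/2, so the sketch uses the inverse transform and searches only the first half.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for a real FFT)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * k * t / n)
        # Only the inverse transform is scaled by 1/N.
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(frame):
    """Hamming window -> DFT -> log power spectrum -> inverse DFT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = dft(windowed)
    # abs() is the polar magnitude; the phase is simply never used.
    # The tiny floor avoids log(0) on empty bins (an assumption of this sketch).
    log_power = [math.log(abs(c) ** 2 + 1e-12) for c in spectrum]
    return [c.real for c in dft(log_power, inverse=True)]


def estimate_f0(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Return sample_rate / lag for the largest cepstral value in the pitch range."""
    cep = real_cepstrum(frame)
    lo = int(sample_rate / f_max)                        # shortest period, in samples
    hi = min(int(sample_rate / f_min), len(frame) // 2)  # stay in the first half
    peak = max(range(lo, hi + 1), key=lambda q: cep[q])
    return sample_rate / peak


if __name__ == "__main__":
    # Synthetic test frame: one pulse every 40 samples at an 8 kHz sample rate.
    frame = [1.0 if i % 40 == 0 else 0.0 for i in range(256)]
    print(estimate_f0(frame, 8000.0))
```

The index of the peak is a period in samples (a lag), not a frequency bin, which is why the estimate is `sample_rate / peak`. Restricting the search to a plausible pitch range keeps the large values near index 0, which describe the overall spectral shape, from being picked as the pitch.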
Here is how I have implemented this in AS3:
var signal:Vector.<Number> = my1024PointSignal; // an audio signal 1024 samples long
var imx:Vector.<Number> = new Vector.<Number>(signal.length); // 1024 point vector to hold imaginary part of fft
hammingWindow(signal); // window it
zeroFill(imx); // fill imx with zeros
FFT(signal, imx); // convert signal into real and imaginary components of fft
toPolar(signal, imx); // convert fft to polar coordinates
// square each bin, and take the log of each bin, discard phase
for (var i:int = 0, l:int = signal.length; i < l; i++) {
    signal[i] = Math.log(Math.pow(signal[i], 2));
    imx[i] = 0;
}
FFT(signal, imx); // or maybe inverseFFT(signal, imx), I don't know
Now, when I do this and finish with the FFT, the bins in the plot appear to be in reverse order, and I see a larger peak at the second harmonic than at the fundamental. When I finish with the inverse FFT instead, I get a signal that looks reflected around N/2, and again the peaks seem to be reversed. The whole thing is also quite noisy. What am I doing wrong?