I'm creating a voice training app. I use an FFT to transform the signal from the time domain to the frequency domain, after windowing it with a Blackman-Harris window, and then use the harmonic product spectrum (HPS) to extract the fundamental frequency. The lowest note is F2 (87.307 Hz) and the highest is C6 (1046.502 Hz). The FFT length is 8192 and the sampling frequency is 44100 Hz.
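For reference, the bin arithmetic these settings imply can be sketched as follows. This is only an illustration: the `BinMath` class and its method names are hypothetical and not taken from the app's code.

```csharp
using System;

// Hypothetical helper for converting between FFT bins and frequencies.
static class BinMath
{
    // Width of one FFT bin in Hz: sampleRate / fftLength.
    public static double BinWidthHz(double sampleRate, int fftLength)
        => sampleRate / fftLength;

    // Fractional bin position at which a given frequency falls.
    public static double FrequencyToBin(double frequencyHz, double sampleRate, int fftLength)
        => frequencyHz * fftLength / sampleRate;

    // Center frequency of an integer bin index.
    public static double BinToFrequency(int bin, double sampleRate, int fftLength)
        => bin * sampleRate / fftLength;
}
```

With a 44100 Hz sampling rate and an FFT length of 8192, `BinMath.BinWidthHz(44100, 8192)` is about 5.38 Hz, so F2 sits near bin 16.2 and C6 near bin 194.4. The gap between F2 and F#2 (about 5.19 Hz) is slightly smaller than one bin.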
To fix the octave errors, I applied the rule mentioned here:
    float[] array = hps.HPS(Data);

    // Find the bin with the largest HPS magnitude.
    float hpsmax_mag = float.MinValue;
    float hpsmax_index = -1;
    for (int i = 0; i < array.Length; i++)
    {
        if (array[i] > hpsmax_mag)
        {
            hpsmax_mag = array[i];
            hpsmax_index = i;
        }
    }

    // Fix octave-too-high errors: find the strongest bin below 3/4 of the peak bin.
    int correctMaxBin = 1;
    int maxsearch = (int)hpsmax_index * 3 / 4;
    for (int j = 2; j < maxsearch; j++)
    {
        if (array[j] > array[correctMaxBin])
        {
            correctMaxBin = j;
        }
    }

    // If that bin lies about an octave below the peak (within 4 bins) and holds
    // more than 20% of the peak's magnitude, use it instead.
    if (Math.Abs(correctMaxBin * 2 - hpsmax_index) < 4)
    {
        if (array[correctMaxBin] / array[(int)hpsmax_index] > 0.2)
        {
            hpsmax_index = correctMaxBin;
        }
    }
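The `hps.HPS(Data)` call above is not shown. As a minimal sketch of what a harmonic product spectrum typically computes, assuming the input is a magnitude spectrum, something like the following could stand in for it. `HpsSketch.Compute` and its `harmonics` parameter are hypothetical names, and the actual implementation in the app may differ (for example, it may downsample differently or work on log magnitudes).

```csharp
using System;

// Hypothetical stand-in for the HPS step; not the app's actual implementation.
static class HpsSketch
{
    // For each bin i, multiply the magnitudes at i, 2i, 3i, ... up to
    // `harmonics` terms. A true fundamental lines up with its own harmonics,
    // so its product stands out from the rest.
    public static float[] Compute(float[] magnitude, int harmonics)
    {
        if (harmonics < 1) throw new ArgumentOutOfRangeException(nameof(harmonics));
        if (magnitude.Length == 0) return new float[0];

        // Keep only bins whose highest harmonic still falls inside the spectrum.
        int length = (magnitude.Length - 1) / harmonics + 1;
        float[] hps = new float[length];
        for (int i = 0; i < length; i++)
        {
            float product = magnitude[i];
            for (int h = 2; h <= harmonics; h++)
            {
                product *= magnitude[i * h];
            }
            hps[i] = product;
        }
        return hps;
    }
}
```

In this sketch, the number of harmonics limits the highest detectable bin: with `harmonics` terms, only the first `(magnitude.Length - 1) / harmonics + 1` bins are returned.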
I tested the system with sawtooth waves and the octave errors are still present. From 87.307 Hz up to about 190 Hz the detected pitch is one or two octaves too high. From G5 (783.991 Hz) upwards it is sometimes an octave too low.
Here are some of the results (frequencies in Hz):
Input | Result | Error
F2 (87.307) | F4 (349.228) | 2 octaves higher
G2 (97.999) | G4 (391.995) | 2 octaves higher
A2 (110) | A3 (220) | an octave higher
D3 (146.832) | mostly D4 (293.665), also D3 | an octave higher
A3 (220) | A3 | Correct
A4 (440) | A4 | Correct
G5 (783.991) | mostly G5, also G4 (391.995) | an octave lower
A5 (880) | A5 | Correct
C6 (1046.502) | C6 | Correct
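The error column in the results above can also be expressed as a number, which may help when testing many notes automatically. A minimal sketch, using a hypothetical `OctaveError.InOctaves` helper that is not part of the app:

```csharp
using System;

// Hypothetical helper for measuring how far a detected pitch is from the input.
static class OctaveError
{
    // Signed distance in octaves between the detected and the expected pitch:
    // about +1 means one octave too high, about -1 one octave too low,
    // and about 0 means the note was detected correctly.
    public static double InOctaves(double expectedHz, double detectedHz)
        => Math.Log(detectedHz / expectedHz, 2.0);
}
```

For example, `OctaveError.InOctaves(87.307, 349.228)` evaluates to about 2 (two octaves too high), and `OctaveError.InOctaves(783.991, 391.995)` to about -1 (one octave too low).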
Please help me fix this, because it badly affects the system's final feedback to the user.