I am writing a music visualizer in C++. It displays the frequency spectrum of the audio input. I use Aquila-dsp to read the audio samples, Kiss-fft to compute the FFT, and SFML to play the audio. The input is a .wav file, and the graph is plotted with OpenGL.
Algorithm Used:
1. *framePointer = 0, N = 10000;*
2. Load audio file and play it using SFML.
3. For *i* = *framePointer* to *framePointer* + *N* - 1 (while *framePointer* + *N* < *total_samples_count*):
Collect audio samples.
4. Apply Window Function (Hann window)
5. Apply *FFT*
6. Calculate magnitude of first N/2 *FFT* data
*Magnitude* = sqrt( re * re + im * im)
7. Convert to dB(log) scale (optional)
10*log(magnitude)
8. Plot the N/2 log(magnitude) values.
9. If *framePointer* >= *total_samples_count* - *N*
Exit
Else go to step 3.
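As a rough, self-contained illustration of steps 4 to 6 for a single frame, here is a minimal sketch. It uses a naive DFT as a stand-in for Kiss-fft so that it compiles without external libraries. The names `hannCoefficient`, `naiveDft`, `frameMagnitudes`, `makeCosine`, and `peakBin` are illustrative and are not taken from the program or from any library.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

const double kPi = 3.14159265358979323846;

// Step 4: Hann window coefficient for sample j of an n-sample frame (n >= 2).
double hannCoefficient(std::size_t j, std::size_t n) {
    return 0.5 * (1.0 - std::cos(2.0 * kPi * static_cast<double>(j) /
                                 static_cast<double>(n - 1)));
}

// Step 5 stand-in: naive O(n^2) DFT, used here instead of Kiss-fft.
std::vector<std::complex<double>> naiveDft(const std::vector<double>& x) {
    const std::size_t n = x.size();
    std::vector<std::complex<double>> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -2.0 * kPi * static_cast<double>(k * t) /
                                 static_cast<double>(n);
            out[k] += x[t] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
    }
    return out;
}

// Steps 4 to 6: window the frame, transform it, and keep the magnitudes
// of the first n/2 bins.
std::vector<double> frameMagnitudes(const std::vector<double>& frame) {
    const std::size_t n = frame.size();
    std::vector<double> windowed(n);
    for (std::size_t j = 0; j < n; ++j) {
        windowed[j] = hannCoefficient(j, n) * frame[j];
    }
    const std::vector<std::complex<double>> spectrum = naiveDft(windowed);
    std::vector<double> mag(n / 2);
    for (std::size_t k = 0; k < mag.size(); ++k) {
        mag[k] = std::abs(spectrum[k]);  // sqrt(re * re + im * im)
    }
    return mag;
}

// Test signal: a cosine completing `cycles` periods over n samples.
std::vector<double> makeCosine(std::size_t n, double cycles) {
    std::vector<double> x(n);
    for (std::size_t t = 0; t < n; ++t) {
        x[t] = std::cos(2.0 * kPi * cycles * static_cast<double>(t) /
                        static_cast<double>(n));
    }
    return x;
}

// Index of the largest magnitude in a non-empty spectrum.
std::size_t peakBin(const std::vector<double>& mag) {
    std::size_t best = 0;
    for (std::size_t k = 1; k < mag.size(); ++k) {
        if (mag[k] > mag[best]) best = k;
    }
    return best;
}
```

For example, `peakBin(frameMagnitudes(makeCosine(64, 4.0)))` should come out as 4, since a tone with 4 cycles per frame falls into bin 4.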
    #define N 10000
    int framePointer = 0;   // index of the first sample of the next frame

    void getData()
    {
        int i, j;
        Aquila::WaveFile wav(fileName);
        double mag[N/2];
        double roof = wav.getSamplesCount();

        // Collect the next N samples, applying the Hann window to each one
        for (i = framePointer, j = 0;
             i < (framePointer + N) && framePointer < roof - N; i++, j++) {
            double multiplier = 0.5 * (1 - cos(2*M_PI*j/(N-1)));
            in[j].r = multiplier * wav.sample(i);
            in[j].i = 0;   // real input, so the imaginary part is zero
        }

        if (framePointer < roof - N - 1) {
            // Advance to the start of the next frame
            framePointer = i;
        }
        else {
            printf("Frame pointer > roof - N \n");
            printf("Framepointer = %d\n", framePointer);
            // Get the total running time and exit
            timestamp_t t1 = get_timestamp();
            double secs = (t1 - tmain) / 1000000.0L;
            std::cout << "Program exit.\nTotal time: " << secs << std::endl;
            exit(0);
        }

        // Apply the FFT
        getFft(in, out);

        // Calculate the magnitude of the first N/2 FFT bins
        for (i = 0; i < N/2; i++) {
            mag[i] = sqrt((out[i].r * out[i].r) + (out[i].i * out[i].i));
            graph[i] = log(mag[i]) * 10;
        }
    }
I plot the graph using OpenGL. Full source code
My problem is choosing the frame length (the value of N).
For an audio file with the following properties:
Length: 237191 ms
Sample frequency: 44100 Hz
Channels: 2
Byte rate: 172 kB/s
Bits per sample: 16b
The graph is synchronized with the audio if I choose N = 10000, or at least it stops when the audio ends.
How should I choose N (the frame length) so that the spectrum stays synchronized with the audio? The audio has two channels; will this algorithm work for that?
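On the two-channel part, one common approach is to average each left/right pair into a single mono stream before windowing. The sketch below assumes the samples are available as interleaved 16-bit values (L0, R0, L1, R1, ...). Whether Aquila's `wav.sample(i)` already returns a single channel is not checked here, and `downmixToMono` is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Average interleaved stereo samples (L0, R0, L1, R1, ...) into one mono stream.
// A trailing unpaired sample, if any, is ignored.
std::vector<double> downmixToMono(const std::vector<std::int16_t>& interleaved) {
    std::vector<double> mono(interleaved.size() / 2);
    for (std::size_t i = 0; i < mono.size(); ++i) {
        mono[i] = 0.5 * (static_cast<double>(interleaved[2 * i]) +
                         static_cast<double>(interleaved[2 * i + 1]));
    }
    return mono;
}
```

The mono stream has half as many values as the interleaved input, so the frame index would then count per-channel samples.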