STFT / sliding FFT on real-time data

Question

I recently picked up a project where I need to perform real-time sliding FFT analysis on incoming microphone data. I am doing this in C++, using OpenGL and Cinder.

This is my first experience in audio programming and I am a little bit confused.

This is what I am trying to achieve in my OpenGL application:

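[Figure: FFT window passes advancing by the hop size across two consecutive frames; the blue area marks the region between the two frames]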

In every frame, a chunk of incoming data arrives. In a for-loop (so in multiple passes), a window over the current data is consumed and FFT analysis is performed on it. On the next iteration of the loop, the window advances by the hop size through the data, and so on until the end of the data is reached.

Now this process must be contiguous. But as you can see in the figure above, once my current app frame ends and the next frame's data comes in, I can't pick up where I left off in the previous frame (because that data is already gone). You can see this in the figure, where the blue area sits between the two frames.

Now you may say: pick the window size and hop size so that this never happens. But that is impossible, since these parameters should be left user-configurable in my project.

Suggestions for this kind of processing, oriented towards C++11, are also very welcome!

Thanks!

I don't understand this wording: *"as soon as the number of N samples are finished and the current buffer's processing is done, the next incoming buffer's first windowed chunk will have a gap unrelated to k with the last windowed chunk of data from previous buffer"* - perhaps you can draw us some ASCII diagrams for that part? — John Zwinck, Feb 09 '15 at 05:17
@JohnZwinck I included an image that hopefully pictures what I am trying to achieve. Thanks! — Sepehr, Feb 09 '15 at 19:39
You show pass 1,2,3,4,5 all in Frame 1. I think you should rethink this and what you call a "pass" should be called a "frame". Using a hop size of 0.25*framelength should get you what you want. — lmat - Reinstate Monica, Nov 19 '18 at 17:42

score 1 · Answer 1 · answered Feb 09 '15 at 05:33

I'm not sure I understand your scenario 100%, but it sounds like you may want to use a circular buffer. There is no circular buffer in the C++ standard library, but there's one in Boost (boost::circular_buffer).

However, you'd need a lock if you plan to do the processing with 2 threads. One thread, for example, would wait on the audio input, then take the buffer lock, and copy from the audio buffer to the circular buffer. The second thread would periodically take the buffer lock and read the next k elements, if there are at least k available in the buffer...

You'd need to adjust the size of the buffer appropriately and make sure you always handle the data faster than the incoming rate to avoid losses in the circular buffer...
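To make the locking scheme concrete, here is a minimal sketch of a mutex-guarded ring buffer in C++11. The class name LockedRingBuffer and its write/readWindow methods are hypothetical (they are not Boost's or Cinder's API), and two choices are my assumptions: samples that do not fit are dropped instead of overwriting unread data, and the reader copies a full window but only discards hop samples, so consecutive windows overlap as in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical fixed-capacity ring buffer guarded by a single mutex.
class LockedRingBuffer {
public:
    explicit LockedRingBuffer(std::size_t capacity)
        : data_(capacity), head_(0), size_(0) {
        assert(capacity > 0);
    }

    // Producer side (audio input thread): copy up to n samples in.
    // Returns the number written; samples that do not fit are dropped
    // (an assumed policy, chosen so unread data is never overwritten).
    std::size_t write(const float* src, std::size_t n) {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t written = 0;
        while (written < n && size_ < data_.size()) {
            data_[(head_ + size_) % data_.size()] = src[written++];
            ++size_;
        }
        return written;
    }

    // Consumer side (analysis thread): if at least `window` samples are
    // buffered, copy the oldest `window` samples to dst, then discard only
    // `hop` of them so the next window overlaps this one.
    bool readWindow(float* dst, std::size_t window, std::size_t hop) {
        assert(hop > 0);
        std::lock_guard<std::mutex> lock(mutex_);
        if (size_ < window) return false;
        for (std::size_t i = 0; i < window; ++i)
            dst[i] = data_[(head_ + i) % data_.size()];
        const std::size_t advance = hop < size_ ? hop : size_;
        head_ = (head_ + advance) % data_.size();
        size_ -= advance;
        return true;
    }

private:
    std::vector<float> data_;
    std::size_t head_;  // index of the oldest unread sample
    std::size_t size_;  // number of unread samples
    std::mutex mutex_;
};
```

The analysis thread would call readWindow in a loop until it returns false, running the FFT on each window it gets. Because unread samples stay in the buffer between calls, a window can span two incoming audio buffers.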

I'm not sure why you mention that the buffer is lock-free, or whether that is a requirement. I'd try the circular buffer with locks first, as it seems conceptually simpler, and only go lock-free if you have to, because the data structure could be more complicated in that case (though a lock-free "producer-consumer" queue might work)...

HTH.

There's a [RingBuffer class](https://github.com/cinder/Cinder/blob/master/include/cinder/audio/dsp/RingBuffer.h) within libcinder for this very purpose (catered towards audio processing). — rich.e, Dec 11 '15 at 04:34

score 1 · Answer 2 · answered Feb 10 '15 at 00:20

Thanks for posting a graphic--that illustrates the problem nicely.

All you really need here is a buffer of size (window - 1) where you can store zero or more samples from the "previous" frame for processing in the "next" one. In C++ this would be:

std::vector<Sample> interframeBuffer;
interframeBuffer.reserve(windowSize - 1);

Then, when you are within windowSize samples of the end of the current frame, rather than processing the samples you store them with interframeBuffer.push_back(sample). When you start processing the next frame, you first do:

for (const Sample& sample : interframeBuffer) {
    process(sample);
}
interframeBuffer.clear();

You should use a single vector the whole time, clearing it and repopulating it as needed, to avoid memory allocation. That's why we call reserve() at the top--to avoid latency later on. Calling clear() doesn't release the memory, it just resets the size() to zero.
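Putting the pieces together, one possible shape for this is sketched below. It is a variation on the idea above rather than the exact code: the leftover samples and the new frame live in one vector, so a window can straddle the frame boundary. FrameWindower, pushFrame, the maxFrameSize parameter and the onWindow callback are all hypothetical names, and Sample is assumed to be float.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Sample = float;  // assumption: samples are floats

// Hypothetical helper that cuts a stream of frames into overlapping windows.
class FrameWindower {
public:
    FrameWindower(std::size_t windowSize, std::size_t hopSize,
                  std::size_t maxFrameSize)
        : window_(windowSize), hop_(hopSize), skip_(0) {
        assert(windowSize > 0 && hopSize > 0);
        // Worst case: (window - 1) carried samples plus one full frame.
        pending_.reserve(windowSize - 1 + maxFrameSize);
    }

    // Append one incoming frame, then call onWindow(ptr, windowSize) for
    // every complete window, advancing by the hop size each time.
    template <typename F>
    void pushFrame(const Sample* frame, std::size_t n, F&& onWindow) {
        // If the hop is larger than the window, part of this frame may
        // still have to be skipped from the previous call.
        const std::size_t drop = std::min(skip_, n);
        skip_ -= drop;
        pending_.insert(pending_.end(), frame + drop, frame + n);

        std::size_t start = 0;
        while (start + window_ <= pending_.size()) {
            onWindow(pending_.data() + start, window_);
            start += hop_;
        }
        if (start > pending_.size()) {
            skip_ = start - pending_.size();
            start = pending_.size();
        }
        // Keep only the unconsumed tail (at most window - 1 samples).
        pending_.erase(pending_.begin(), pending_.begin() + start);
    }

private:
    std::size_t window_;
    std::size_t hop_;
    std::size_t skip_;             // samples still to discard from future frames
    std::vector<Sample> pending_;  // carried tail plus current frame
};
```

For example, with a window of 4 and a hop of 2, feeding two 5-sample frames holding the values 0..4 and 5..9 produces windows starting at samples 0, 2, 4 and 6, with no gap at the frame boundary. As long as frames never exceed maxFrameSize, the vector stays within its reserved capacity and does not reallocate.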
