Analysing audio data for attributes at time intervals

Question

I've been wanting to play around with audio parsing for a while now but I haven't really been able to find the correct library for what I want to do.

I basically just want to parse a sound file and get amplitudes, frequencies, and other relevant information at regular intervals during the song (every 10 ms or so), so I can graph the data and see, for example, where the song speeds up a lot and where it gets really loud.

I've looked at OpenAL quite a bit, but it doesn't look like it provides this ability. Other than that, I have not had much luck finding out where to start. If anyone has done this or used a library that can do it, a point in the right direction would be greatly appreciated. Thanks!

Heads up: this question is going to end in a lecture about signal processing. I won't give it, I'm not qualified. — Chris Eberle, Sep 09 '11 at 04:29
You're asking about turning a digital (quantized) audio source into something meaningful (like finding where a song speeds up), and that's an entire field of study unto itself. — Chris Eberle, Sep 09 '11 at 04:37
@Chris Ah, okay I think I see where this is going. So it is for the most part parsing a binary data stream representing the Hz values or something to find implied data like frequency and amplitude? — bobl, Sep 09 '11 at 04:39
Not quite. In a typical 16-bit PCM wave file, each sample takes one of 65,536 distinct values (which map to voltages), and 44,100 of these samples are measured each second. Recovering the frequencies present in those samples requires a fast Fourier transform. — Chris Eberle, Sep 09 '11 at 04:50
http://audacity.sourceforge.net/manual-1.2/tutorial_basics_1.html — Chris Eberle, Sep 09 '11 at 04:51
http://www.intmath.com/fourier-series/7-fast-fourier-transform-fft.php — Chris Eberle, Sep 09 '11 at 04:51
Note that there is now a **signal processing** StackExchange site where this question would probably be better asked: http://dsp.stackexchange.com — Paul R, Sep 09 '11 at 06:08

score 1 · Accepted Answer · answered Sep 09 '11 at 05:12

For parsing and decoding audio files I had good results with libsndfile, which runs on Windows/OSX/Linux and is open source (LGPL license). This library does not support mp3 (the author wants to avoid licensing issues), but it does support FLAC and Ogg/Vorbis.

If working with closed source libraries is not a problem for you, then an interesting option could be the Quicktime SDK from Apple. This SDK is available for OSX and Windows and is free for registered developers (you can register as an Apple developer for free as well). With the QT SDK you can parse all the file formats that the Quicktime Player supports, and that includes .mp3. The SDK gives you access to all the codecs installed by QuickTime, so you can read .mp3 files and have them decoded to PCM on the fly. Note that to use this SDK you have to have the free QuickTime Player installed.

As for signal processing libraries, I honestly can't recommend any, as I have written my own functions (for speech recognition, in case you are curious). There are a few open source projects that seem interesting listed on this page.

I recommend that you start simple, for example by analyzing amplitude data, which is readily available from the PCM samples without having to do any processing. Being able to visualize the data is very useful. I have found Audacity to be an excellent visualization tool, and since it is open source you can build your own tests inside it.
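To give an idea of what "start with amplitude" could look like, here is a minimal sketch in Python using only the standard library. It is not tied to libsndfile or QuickTime. It assumes an uncompressed 16-bit PCM .wav file, and the function name `rms_per_window` and the `window_ms` parameter are made up for this illustration. It reads the file in 10 ms chunks and computes the RMS (root mean square) level of each chunk, which is a rough measure of loudness:

```python
import math
import struct
import wave


def rms_per_window(path, window_ms=10):
    """Return a list of (time_in_seconds, rms_level) pairs, one per window.

    Assumption: the file is uncompressed 16-bit PCM. Samples from all
    channels are pooled together rather than analysed separately.
    """
    results = []
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wf.getframerate()
        # e.g. 44,100 Hz * 10 ms / 1000 = 441 frames per window
        frames_per_window = max(1, rate * window_ms // 1000)
        index = 0
        while True:
            raw = wf.readframes(frames_per_window)
            if not raw:
                break
            count = len(raw) // 2  # two bytes per 16-bit sample
            # WAV data is little-endian signed 16-bit integers
            samples = struct.unpack("<%dh" % count, raw[:count * 2])
            rms = math.sqrt(sum(s * s for s in samples) / count)
            results.append((index * frames_per_window / rate, rms))
            index += 1
    return results
```

Calling `rms_per_window("song.wav")` (with a hypothetical file name) gives a list you can feed straight into a plotting tool to see where the track gets loud. Frequency information is a separate step: it needs an FFT over each window, and that is where a signal processing library would come in.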

Good luck!
