How does librosa estimate tempo?

Question

I fed an artificially generated piece of music at 120 BPM into:

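import sys
import librosa

y, sr = librosa.load(sys.argv[1])  # resamples to 22050 Hz by default
tempo, beats = librosa.beat.beat_track(y=y, sr=sr)  # beats are frame indices
print("Tempo 1:", tempo)
# Tempo 2: 60 divided by the mean interval between the first and last detected beats
first_beat_time, last_beat_time = librosa.frames_to_time((beats[0], beats[-1]), sr=sr)
print("Tempo 2:", 60 / ((last_beat_time - first_beat_time) / (len(beats) - 1)))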

With the output:

Tempo 1: 117.45383522727273
Tempo 2: 120.03683283914009

Shouldn't those numbers be the same, and almost equal to 120?

Honestly, I'm not sure. I haven't read the linked paper yet, but even if it explains why the librosa tempo differs from the average beat length, I'm still confused about why the librosa tempo is so far from the real song BPM. — ingwarus, May 10 '21 at 14:28
117.45 and 120.03 look pretty close to me, don't they? Or maybe I am not getting something. — Lukasz Tracewski, May 10 '21 at 15:55
120.0368 is close enough to the original 120.000, but 117.45 is off by more than 2%. That's a big error. I could estimate the tempo more accurately by counting beats with a clock in my hand. And don't forget that 120.0368 is calculated from the beats returned by librosa, so librosa should be able to calculate the tempo at least that well. — ingwarus, May 12 '21 at 08:55
How did you generate that signal? It's impossible to answer your question with more precision without the full code. — Lukasz Tracewski, May 12 '21 at 09:28
My friend made the audio .wav file in some music software with a specified BPM. I've checked that BPM with other tools and it really is 120 BPM. The beats are clear and are correctly recognized and returned by librosa (I've checked all of them manually). Only the tempo returned by librosa is wrong. To me it looks like a bug, but maybe I don't understand something. — ingwarus, May 12 '21 at 09:37

score 2 · Answer 1 · answered May 08 '21 at 09:43

The algorithm is described in detail in the Beat Tracking by Dynamic Programming paper, as cited in the librosa docs. In essence:

Estimate a global tempo.
Use this tempo to construct a transition cost function.
Use dynamic programming to find the best-scoring set of beat times that reflect the tempo as well as corresponding to moments of high "onset strength" in a function derived from the audio.
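To make the second and third steps concrete, here is a minimal pure-Python sketch of that kind of dynamic program. It is not librosa's implementation: the function name `track_beats`, the `tightness` weight, and the search window of half a period to two periods back are all assumptions chosen for illustration, and the tempo (expressed as a beat period in frames) is simply passed in rather than estimated.

```python
import math

def track_beats(onset, period, tightness=100.0):
    """Pick beat frames from an onset-strength list, given a beat period in frames.

    Hypothetical helper for illustration only; not librosa's code.
    """
    n = len(onset)
    score = [float(x) for x in onset]   # best cumulative score of a beat chain ending at t
    backlink = [-1] * n                 # previous beat in that chain, -1 if none
    for t in range(n):
        best, best_prev = 0.0, -1       # only link back if it improves the score
        # Candidate previous beats: between two periods and half a period ago.
        for p in range(max(0, t - 2 * period), t - period // 2):
            # Transition cost: zero when the gap equals the period, growing
            # with the squared log-ratio as the gap deviates from it.
            penalty = -tightness * math.log((t - p) / period) ** 2
            cand = score[p] + penalty
            if cand > best:
                best, best_prev = cand, p
        score[t] = onset[t] + best
        backlink[t] = best_prev
    # Backtrace from the frame with the highest cumulative score.
    t = max(range(n), key=lambda i: score[i])
    beats = []
    while t >= 0:
        beats.append(t)
        t = backlink[t]
    return beats[::-1]

# Synthetic onset envelope: an impulse every 10 frames.
onset = [1.0 if i % 10 == 0 else 0.0 for i in range(100)]
print(track_beats(onset, period=10))
```

In this sketch the tempo only shapes the transition penalty, while the beat positions themselves are pulled toward frames with high onset strength. That is one reason a tempo figure derived from the beat positions need not equal the global tempo estimate that seeded the search.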

The algorithm is deterministic, but in order to get precisely the same result, you'd need to make sure that exactly the same frames fit in a processing window (which they don't in your case).
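A rough way to see where a figure like 117.45 can come from is to look at the tempo grid. This assumes the defaults were in effect (a sample rate of 22050 Hz from `librosa.load` and a hop length of 512 samples) and that the global tempo is picked from autocorrelation lags measured in whole frames, so only the values 60 * sr / (hop_length * lag) are available. The helper name `lag_to_bpm` is made up for this sketch.

```python
SR = 22050   # assumed: default sample rate after librosa.load
HOP = 512    # assumed: default hop length in samples

def lag_to_bpm(lag, sr=SR, hop=HOP):
    """Tempo in BPM corresponding to a beat period of `lag` whole frames."""
    return 60.0 * sr / (hop * lag)

# Tempo candidates around 120 BPM on this grid.
candidates = {lag: lag_to_bpm(lag) for lag in range(20, 24)}
for lag, bpm in candidates.items():
    print(lag, bpm)
```

Under these assumptions a lag of 22 frames corresponds to about 117.4538 BPM, which matches the "Tempo 1" value in the question, and a lag of 21 frames corresponds to 123.046875 BPM. 120 BPM would need a lag of roughly 21.5 frames, which is not on the grid, so the global estimate lands on a neighbouring bin. The beat positions are not restricted to that grid, which is why the tempo computed from them comes out closer to 120.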
