4

I have a chroma features file here. How can these numbers be interpreted as belonging to different musical keys? I need the key found at a particular time code to produce a solution similar to this, in order to mix between two tracks. How can these numbers be interpreted as an overall key being played, and how can I skip to a particular time code to get the key at that point?

I've tried getting the chroma as described here, but the output is just numbers rather than musical notes. I need to interpret the music at a particular time code as belonging to a single key.

plgent
  • To estimate a key using chroma features you could use the [Krumhansl-Schmuckler key-finding algorithm](http://rnhart.net/articles/key-finding/). In essence, you average chroma features over some time (a tonal key is not instantaneous! you cannot determine the key from just one time frame) and then try to find a pre-computed key profile that correlates best with your averaged chroma vector. Note that modern approaches use CNNs; see for example [here](https://www.music-ir.org/mirex/abstracts/2017/HS1.pdf) and [here](https://arxiv.org/pdf/1808.05340.pdf). – Hendrik Jul 18 '19 at 06:31
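
As a rough sketch of the correlation idea in that comment (not taken from the linked article; the profile values below are the commonly cited Krumhansl-Kessler profiles, and all names are illustrative):

import numpy as np

# Krumhansl-Kessler key profiles (commonly published values; treat them as an assumption here)
MAJOR_PROFILE = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                          2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR_PROFILE = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                          2.54, 4.75, 3.98, 2.69, 3.34, 3.17])
PITCHES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def estimate_key(mean_chroma):
    """Correlate a 12-bin averaged chroma vector against all 24 rotated key profiles."""
    best_key, best_corr = None, -np.inf
    for tonic in range(12):
        for profile, mode in ((MAJOR_PROFILE, 'major'), (MINOR_PROFILE, 'minor')):
            # rotate the profile so its first entry lines up with the candidate tonic
            rotated = np.roll(profile, tonic)
            corr = np.corrcoef(mean_chroma, rotated)[0, 1]
            if corr > best_corr:
                best_key, best_corr = f'{PITCHES[tonic]} {mode}', corr
    return best_key, best_corr

# mean_chroma = chromagram.mean(axis=1)   # average the chromagram over time frames first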

2 Answers

4

librosa.feature.chroma_stft returns a chroma spectrogram (or chromagram). It has shape (12, n_frames): one bin for each of the 12 semitones in an octave (C, C#, D, ..., B). Each bin in the chroma spectrogram represents the average energy of that semitone (across all octaves).

n_frames is the number of time-frames in the spectrogram. Each frame is hop_length/sr seconds long, where sr is the sample rate of the loaded audio file (possibly resampled). So to go to a given time in seconds in this spectrogram, compute frame_no = int(time / (hop_length/sr)).
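
For example, here is a minimal sketch of looking up the chroma vector at a given time; the file name, hop length and time code are placeholders:

import librosa

# compute the chromagram; 'song.wav' and hop_length are placeholders
y, sr = librosa.load('song.wav')
hop_length = 512
chromagram = librosa.feature.chroma_stft(y=y, sr=sr, hop_length=hop_length)  # shape (12, n_frames)

# frame index for a time of interest, e.g. 42.0 seconds
time_s = 42.0
frame_no = int(time_s / (hop_length / sr))
# or let librosa do the conversion
frame_no = librosa.time_to_frames(time_s, sr=sr, hop_length=hop_length)

chroma_at_time = chromagram[:, frame_no]  # 12 values, one per semitone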

Musical key

Going from a chroma spectrogram to a musical key (e.g. "this music is in A minor" or "in F major") can be done with supervised machine learning. A classifier would be trained on short time windows of the chroma spectrogram (say 1-10 seconds) to classify the tonic (12 classes, C-B) and the mode (minor or major).

For an example, see the paper Detecting Musical Key with Supervised Learning by Robert Mahieu (2016).
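
A hedged structural sketch of that idea follows; the window length, classifier choice and all names are assumptions, and the labelled training data is not shown:

import numpy as np
from sklearn.linear_model import LogisticRegression

def chroma_windows(chromagram, frames_per_window):
    """Average a (12, n_frames) chromagram over fixed-length windows: one 12-dim vector per window."""
    n = chromagram.shape[1] // frames_per_window
    return np.array([chromagram[:, i * frames_per_window:(i + 1) * frames_per_window].mean(axis=1)
                     for i in range(n)])

clf = LogisticRegression(max_iter=1000)  # any off-the-shelf classifier could stand in here
# clf.fit(X_train, y_train)              # X_train: (n_examples, 12) window averages, y_train: labels like 'E major'
# predicted = clf.predict(chroma_windows(chromagram, frames_per_window=430))  # ~10 s windows at sr=22050, hop=512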

Jon Nordby
  • A sample length of just 1 second is probably enough to figure out a chord, but not always enough to figure out the key of a piece of music. For mixing between two tracks that may or may not be enough depending on the complexity of the music. –  Jul 25 '19 at 20:09
  • Yeah, windows might have to be longer. It is also possible to merge results from multiple analysis windows. – Jon Nordby Jul 25 '19 at 21:07
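
As a small illustration of merging per-window results, one hedged option is a simple majority vote over the per-window key estimates (names are illustrative):

from collections import Counter

def merge_window_keys(per_window_keys):
    """Majority vote over per-window key estimates, e.g. ['E major', 'E major', 'A major'] -> 'E major'."""
    return Counter(per_window_keys).most_common(1)[0][0]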
1

There are certainly lots of methods that can be applied. A really simple method that works for lots of popular songs is to look for patterns in the chroma that match a key. We can find the most common note, assume it is the root, and then check whether the third is major or minor by seeing which of the two has a higher value. In this case I used the entire song, which is about 2 minutes long, to get the chroma. Since you can already get the numbers out, I'll start from there.

# the chroma_cqt for And Your Bird Can Sing
song_chroma = [0.31807876,
 0.27062345,
 0.2786934,
 0.49264827,
 0.6221079,
 0.47696424,
 0.38320214,
 0.3663701,
 0.4019624,
 0.34131885,
 0.35606056,
 0.411583]

# pitches in 12 tone equal temperament 
pitches = ['C','C#','D','D#','E','F','F#','G','G#','A','A#','B']

# print note to value relations
for y in range(len(song_chroma)):
    print(str(pitches[y]) + '\t' + str(song_chroma[y]))

# select the most dominant pitch
pitch_id = song_chroma.index(max(song_chroma))
pitch = pitches[pitch_id]

min_third_id = (pitch_id+3)%12
maj_third_id = (pitch_id+4)%12

# check whether the musical 3rd is major or minor
if song_chroma[min_third_id] < song_chroma[maj_third_id]:
    third = 'major'
    print(str.format('\nThis song is likely in {} {}',pitch, third))
elif song_chroma[min_third_id] > song_chroma[maj_third_id]:
    third = 'minor'
    print(str.format('\nThis song is likely in {} {}',pitch, third))
else:
    print(str.format('\nThis song might be in {} something???',pitch))

Output:

C       0.31807876
C#      0.27062345
D       0.2786934
D#      0.49264827
E       0.6221079
F       0.47696424
F#      0.38320214
G       0.3663701
G#      0.4019624
A       0.34131885
A#      0.35606056
B       0.411583

This song is likely in E major

There are more complicated rules-based approaches that could be taken, such as looking at how strong the 4th, the 5th, the minor fall or the major lift are... or what the 7th is doing, and on into all sorts of complicated and fun music theory stuff. As noted in jonnor's answer there are ML/DL approaches too, and Hendrik's comment also has good info on more sophisticated models. And yes, this song is in E major according to Alan Pollack's Notes On
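
For instance, here is a hedged sketch of one such extra rule, reusing song_chroma, pitches, pitch_id and pitch from the code above; the 0.6 threshold is an arbitrary assumption:

# extra sanity check: the perfect fifth (7 semitones above the root)
# usually also carries noticeable energy in tonal music
fifth_id = (pitch_id + 7) % 12
if song_chroma[fifth_id] >= 0.6 * song_chroma[pitch_id]:  # 0.6 is an arbitrary threshold
    print(str.format('The 5th ({}) is also strong, which supports {} as the root',
                     pitches[fifth_id], pitch))
else:
    print(str.format('The 5th ({}) is weak, so the root estimate {} is less certain',
                     pitches[fifth_id], pitch))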