Finding peaks in audio spectrogram

Question

Introduction: I am working on audio fingerprinting and have some doubts about peak detection in the spectrogram. My input is a WAV file with spectrogram as: The method I'm implementing is given here

Problem: The peaks returned by the get_2D_peaks() method do not overlap with the spectrogram above, and the plot changes drastically between files. For the file used above, the output of get_2D_peaks() is:

It seems like I'm not fitting the signal data well. Does anyone have any ideas on how I can find the peaks in the spectrogram and also plot them?

score 0 · Answer 1 · answered Apr 11 '23 at 13:59

Audio fingerprinting using spectrogram peaks is discussed in Fundamentals of Music Processing: Audio Identification.

It provides code for finding the peaks of a magnitude spectrogram, based on scipy.ndimage, reproduced below:

import numpy as np
from scipy import ndimage


def compute_constellation_map(Y, dist_freq=7, dist_time=7, thresh=0.01):
    """Compute constellation map (implementation using image processing)

    Notebook: C7/C7S1_AudioIdentification.ipynb

    Args:
        Y (np.ndarray): Spectrogram (magnitude)
        dist_freq (int): Neighborhood parameter for frequency direction (kappa) (Default value = 7)
        dist_time (int): Neighborhood parameter for time direction (tau) (Default value = 7)
        thresh (float): Threshold parameter for minimal peak magnitude (Default value = 0.01)

    Returns:
        Cmap (np.ndarray): Boolean mask for peak structure (same size as Y)
    """
    result = ndimage.maximum_filter(Y, size=[2*dist_freq+1, 2*dist_time+1], mode='constant')
    Cmap = np.logical_and(Y == result, result > thresh)
    return Cmap
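
To show how the boolean mask can be turned into points that line up with the spectrogram, here is a minimal sketch. It makes several assumptions that are not in the original post: the spectrogram is computed with scipy.signal.stft, a synthetic 1 kHz tone stands in for the WAV file, and the helper names peaks_from_signal and plot_peaks as well as the n_fft and hop values are made up for illustration. One possible cause of peaks not overlapping the spectrogram (a guess, since the original plots are not available) is plotting bin indices against axes in Hz and seconds, or swapping the frequency and time order; the sketch converts indices to physical units before plotting.

```python
import numpy as np
from scipy import ndimage, signal


def compute_constellation_map(Y, dist_freq=7, dist_time=7, thresh=0.01):
    # Same logic as the function quoted above, repeated so this sketch runs on its own.
    result = ndimage.maximum_filter(
        Y, size=[2 * dist_freq + 1, 2 * dist_time + 1], mode='constant')
    return np.logical_and(Y == result, result > thresh)


def peaks_from_signal(x, fs, n_fft=1024, hop=512):
    """Hypothetical helper: magnitude STFT plus peak indices for a mono signal."""
    freqs, times, Z = signal.stft(x, fs=fs, nperseg=n_fft, noverlap=n_fft - hop)
    Y = np.abs(Z)  # magnitude spectrogram, shape (freq bins, time frames)
    Cmap = compute_constellation_map(Y)
    # Row index = frequency bin, column index = time frame.
    f_idx, t_idx = np.nonzero(Cmap)
    return freqs, times, Y, f_idx, t_idx


def plot_peaks(freqs, times, Y, f_idx, t_idx):
    """Hypothetical helper: overlay the peaks on the spectrogram (needs matplotlib)."""
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    # Show the spectrogram in dB; the small offset avoids log(0).
    ax.pcolormesh(times, freqs, 20 * np.log10(Y + 1e-10), shading='auto')
    # Convert bin indices to seconds and Hz so the peaks share the image axes.
    ax.scatter(times[t_idx], freqs[f_idx], color='r', s=15)
    ax.set_xlabel('Time (s)')
    ax.set_ylabel('Frequency (Hz)')
    return fig


# Synthetic stand-in for the WAV file: 2 seconds of a 1 kHz tone (assumption).
fs = 8000
t = np.arange(2 * fs) / fs
x = np.sin(2 * np.pi * 1000 * t)

freqs, times, Y, f_idx, t_idx = peaks_from_signal(x, fs)
```

To view the result, call plot_peaks(freqs, times, Y, f_idx, t_idx) and then matplotlib.pyplot.show(). The default thresh=0.01 is applied to linear magnitude, so if the spectrogram is scaled differently (for example in dB, or from unnormalized integer samples) the threshold will probably need adjusting.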
