Audio Fingerprint matching - finding closest matches

Question

I am getting audio fingerprints from sound clips, using fpcalc. They look like this:

AQAAE9GSKVOkLEOy5PlQE0d9fId7HD-aHD_xhMeRrKORLseX44etHD8AYcAgSrEjDKFAsIGIFAJZ

AQAAE1M9RUkW1NGFH0d4HcnyJIlw4UW17HiyPMHt4B18EX2go9qJTz_eJzgBgBg4CphigUCMGCWFAcAw

AQAAAA

Now I record a sound and fingerprint it. The result might look like this:

AQAAE5ISLVOkTEF-QfURpkGZHHeeIpehB3HMoRKaikbTKHvQNnlwpIdOxNHHY_IPJttlAECEI8BBAAgFAiigAA

Now I'm searching my database for the closest match using Levenshtein distance, like this:

def levenshtein_distance(first, second):
    """Find the Levenshtein distance between two strings."""
    # Keep the shorter string in `first`.
    if len(first) > len(second):
        first, second = second, first
    if len(second) == 0:
        return len(first)
    first_length = len(first) + 1
    second_length = len(second) + 1
    # distance_matrix[i][j] = distance between first[:i] and second[:j]
    distance_matrix = [[0] * second_length for _ in range(first_length)]
    for i in range(first_length):
        distance_matrix[i][0] = i
    for j in range(second_length):
        distance_matrix[0][j] = j
    # range instead of xrange, so this runs on Python 3 as well
    for i in range(1, first_length):
        for j in range(1, second_length):
            deletion = distance_matrix[i-1][j] + 1
            insertion = distance_matrix[i][j-1] + 1
            substitution = distance_matrix[i-1][j-1]
            if first[i-1] != second[j-1]:
                substitution += 1
            distance_matrix[i][j] = min(insertion, deletion, substitution)
    return distance_matrix[first_length-1][second_length-1]

I'm not getting good results: the recorded sounds do not match well with the samples I give it.

Am I doing this correctly? Are there better fingerprinting libraries out there? I'm using Python or Ruby.

I'm trying to match a whistle to a bird call.

score 2 · Answer 1 · answered Aug 24 '13 at 13:10

First, you should not compare the code strings directly. I do not know which algorithm fpcalc is based on, but it very likely measures some audio features (such as energy or MFCCs, as mentioned in another answer) on each frame of your audio input. These features may be integer values that are then converted to a string (possibly base64). Comparing these strings character by character therefore does not make sense, unless you are trying to identify identical audio content.

I am not sure I fully understand what you are trying to do ("I'm trying to match a whistle to a bird call"), but I do not think it will be solved with audio fingerprinting, since fingerprinting is designed to recognize nearly identical audio content.

Shane Davies · Answer 2 · 2015-12-15T06:59:58.223

2

Run fpcalc with the -raw option to get the 32-bit integers you need to compare.

./fpcalc -raw audio.wav
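To get those integers into Python, something like the sketch below might do. It assumes the command prints KEY=VALUE lines with a FINGERPRINT line holding comma-separated integers, so check the output of your own fpcalc version. parse_fpcalc_raw is a name made up for this example, and the 32-bit mask is only there in case a build prints signed values.

```python
def parse_fpcalc_raw(output):
    """Parse text assumed to look like 'DURATION=..' / 'FINGERPRINT=1,2,3' lines.

    Returns (duration_or_None, list_of_unsigned_32bit_ints).
    """
    duration = None
    fingerprint = []
    for line in output.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        if key == "DURATION":
            duration = float(value)
        elif key == "FINGERPRINT":
            # Mask to 32 bits so signed and unsigned printouts compare equal.
            fingerprint = [int(x) & 0xFFFFFFFF for x in value.split(",") if x]
    return duration, fingerprint


# Made-up sample text, not real fpcalc output.
sample = "DURATION=12\nFINGERPRINT=1627644462,1627709998,1090835758\n"
duration, fps = parse_fpcalc_raw(sample)
print(duration, len(fps))
```

You could feed it the text captured from the command with subprocess.check_output, decoded to a string.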

For a very easy comparison, reduce each fingerprint value to its top 20 bits:

Python example

fps_20 = [x >> 12 for x in fps]

and count the positions where the values differ.
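A rough sketch of that comparison, assuming both fingerprints are already lists of 32-bit integers and start at the same point in the audio (no offset search is attempted here). The function names are made up for illustration. The bit_error_rate helper is an addition of mine rather than part of the method above: it counts differing bits instead of differing values, which may be more forgiving of small changes.

```python
def top_bits(fps, keep=20):
    """Keep the `keep` most significant bits of each 32-bit value."""
    return [(x & 0xFFFFFFFF) >> (32 - keep) for x in fps]


def count_differences(fps_a, fps_b, keep=20):
    """Count aligned positions whose truncated values differ.

    Only the overlapping part (the shorter length) is compared.
    """
    a = top_bits(fps_a, keep)
    b = top_bits(fps_b, keep)
    return sum(1 for x, y in zip(a, b) if x != y)


def bit_error_rate(fps_a, fps_b):
    """Fraction of differing bits over the overlapping 32-bit values."""
    n = min(len(fps_a), len(fps_b))
    if n == 0:
        return 1.0  # nothing to compare, treat as a complete mismatch
    differing = sum(bin((x ^ y) & 0xFFFFFFFF).count("1")
                    for x, y in zip(fps_a, fps_b))
    return differing / (32.0 * n)


# Made-up values, just to show the calls.
recorded = [0x12345678, 0xFFFFFFFF, 0]
stored = [0x12345FFF, 0x0FFFFFFF, 0]
print(count_differences(recorded, stored), bit_error_rate(recorded, stored))
```

The database entry with the lowest count (or lowest bit error rate) would then be the closest candidate. What counts as "close enough" is something you would have to tune on your own recordings.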

edited Dec 15 '15 at 06:59

answered Oct 13 '14 at 08:57


score 1 · Answer 3 · answered Jul 30 '13 at 13:44

Fingerprinting methods do not work well for what you need.

I have seen Mel Frequency Cepstral Coefficients (MFCCs) used to solve this kind of problem.

Another approach is to extract a set of descriptors (mean irregularity, mean centroid, standard deviation of irregularity, MFCCs) and feed them to a classification method (Random Forests, MLP).
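To give an idea of what one such descriptor looks like, here is a minimal standard-library sketch of the mean spectral centroid. It assumes the audio is already a list of float samples, uses a naive DFT (slow, only for illustration), and skips windowing. The function names and the frame size of 64 are arbitrary choices for the example. In practice you would probably compute MFCCs and the other descriptors with an audio library and pass them to a classifier.

```python
import cmath
import math


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of one frame, via a naive DFT."""
    n = len(frame)
    total = 0.0
    weighted = 0.0
    for k in range(n // 2 + 1):  # bins from 0 Hz up to the Nyquist frequency
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        magnitude = abs(coeff)
        total += magnitude
        weighted += magnitude * k * sample_rate / n
    return weighted / total if total > 0 else 0.0


def mean_centroid(samples, sample_rate, frame_size=64):
    """Average the centroid over consecutive non-overlapping frames."""
    starts = range(0, len(samples) - frame_size + 1, frame_size)
    centroids = [spectral_centroid(samples[s:s + frame_size], sample_rate)
                 for s in starts]
    return sum(centroids) / len(centroids) if centroids else 0.0


# Synthetic 1000 Hz tone at 8000 Hz, standing in for a recording.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(128)]
print(mean_centroid(tone, rate))
```

A whistle and a bird call at a similar pitch should give similar centroids even when their fingerprints differ, which is the reason for using descriptors like this.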
