I am trying to write an audio fingerprinting library for educational purposes. It's based on the paper "Computer Vision for Music Identification". I have a couple of questions about the contents of the paper.
I know that two bytes represent one sample, so I wrote this class to extract the samples from a PCM file. I'd like to know if this is right (sorry if it's too obvious :) ).
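    import struct

    class FingerPrint:
        def __init__(self, pcmFile):
            self.pcmFile = pcmFile
            self.samples = []
            self.init()

        def init(self):
            # Read the PCM file two bytes at a time; each pair is one sample.
            # Assumption: raw 16-bit signed little-endian mono PCM, no header.
            with open(self.pcmFile, 'rb') as f:
                byte = f.read(2)
                # Stop at end of file (or on a trailing odd byte)
                while len(byte) == 2:
                    # Decode the two raw bytes into a signed integer
                    self.samples.append(struct.unpack('<h', byte)[0])
                    byte = f.read(2)

    fp = FingerPrint('output.pcm')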
If the above code is OK, then according to the paper the next step is to convolve the signal with a low-pass filter and keep only every 8th sample. I don't understand either operation or why it has to be done. It would be awesome if someone could help me understand (with code if possible).
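To make the question concrete, here is a minimal sketch of what I think that step amounts to; corrections are welcome. My understanding is that keeping every 8th sample divides the sample rate by 8, so anything above 1/16 of the original sample rate would fold back (alias) into the result unless a low-pass filter removes it first. The function names `lowpass_kernel`, `convolve_same` and `decimate`, the 31-tap length and the Hamming window are my own assumptions for illustration, not something taken from the paper.

```python
import math

def lowpass_kernel(num_taps=31, cutoff=1.0 / 16):
    """Windowed-sinc low-pass FIR kernel; cutoff is in cycles per sample.

    num_taps and the Hamming window are illustrative choices (assumptions).
    """
    mid = (num_taps - 1) / 2
    kernel = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response (a sinc), with the x == 0 limit handled
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        # Hamming window reduces the ripple caused by truncating the sinc
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        kernel.append(ideal * window)
    total = sum(kernel)
    return [k / total for k in kernel]  # normalise to unity gain at DC

def convolve_same(signal, kernel):
    """Direct convolution; output has the same length as the input."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, k in enumerate(kernel):
            idx = i + half - j
            if 0 <= idx < len(signal):  # treat samples outside the signal as zero
                acc += signal[idx] * k
        out.append(acc)
    return out

def decimate(signal, factor=8):
    """Low-pass filter below the new Nyquist frequency, then keep every factor-th sample."""
    filtered = convolve_same(signal, lowpass_kernel(cutoff=0.5 / factor))
    return filtered[::factor]
```

With the class above, this would be called as `decimate(fp.samples)`. The direct convolution loop is slow for long recordings; I assume a library routine would replace it in practice.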