convolving an audio signal

Question

I am trying to write an audio fingerprinting library for educational purposes. It's based on the paper "Computer Vision for Music Identification". I have a couple of questions about the contents of the paper.

I know that two bytes represent a sample, so I wrote this class to extract the samples from a PCM file. I'd like to know if this is right (sorry if it's too obvious :) ).

class FingerPrint:

   def __init__(self, pcmFile):
      self.pcmFile = pcmFile
      self.samples = []
      self.init()


   def init(self):
      # Current samples
      currentSamples = []

      # Read pcm file
      with open(self.pcmFile, 'rb') as f:
         byte = f.read(2)
         while byte:  # read() returns an empty value at EOF ('' on Python 2, b'' on Python 3)
           self.samples.append(byte)
           byte = f.read(2)

fp = FingerPrint('output.pcm')

If the above code is OK, then according to the paper the next step is to convolve the signal with a low-pass filter and take every 8th sample. I don't understand these steps or why they have to be done. It would be awesome if someone could help me understand (with code if possible).

score 2 · Accepted Answer · answered Aug 20 '12 at 01:32

After reading the two bytes, you need to convert them into an int. You can use the struct module.

But I think you should use NumPy and SciPy:

To read a WAV file, you can call scipy.io.wavfile.read():
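As a minimal sketch of the struct approach, the helper below reads a raw PCM file two bytes at a time and unpacks each pair into a Python int. The function name `read_pcm16` is made up for illustration, and the format string `'<h'` assumes signed 16-bit little-endian mono samples; adjust it if your file uses a different layout.

```python
import struct

def read_pcm16(path):
    """Read raw PCM into a list of ints (assumes signed 16-bit little-endian mono)."""
    samples = []
    with open(path, 'rb') as f:
        chunk = f.read(2)
        # Stop at EOF; a trailing odd byte (incomplete sample) is ignored.
        while len(chunk) == 2:
            # '<h' = little-endian signed short; unpack returns a 1-tuple.
            samples.append(struct.unpack('<h', chunk)[0])
            chunk = f.read(2)
    return samples
```

This works, but it is slow for long recordings because every sample goes through a separate Python call.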

http://docs.scipy.org/doc/scipy-0.10.0/reference/tutorial/io.html#module-scipy.io.wavfile

If your file is raw PCM data, you can call numpy.fromfile():

http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfile.html

For example:

data = numpy.fromfile("test.pcm", dtype=numpy.int16)

To design a low-pass filter, you can use the filter design functions in scipy.signal:

http://docs.scipy.org/doc/scipy-0.10.1/reference/signal.html#filter-design

To do the convolution, you can use the convolution functions in scipy.signal:

http://docs.scipy.org/doc/scipy-0.10.1/reference/signal.html#convolution

There is also a convolve function in numpy:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.convolve.html
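Putting the pieces together, here is a rough sketch of "low-pass filter, then take every 8th sample". The function name `lowpass_and_decimate`, the FIR design via `scipy.signal.firwin`, and the tap count of 101 are my own choices for illustration. The paper may use a different filter.

```python
import numpy as np
from scipy import signal

def lowpass_and_decimate(samples, factor=8, numtaps=101):
    """Low-pass filter `samples`, then keep every `factor`-th sample."""
    # firwin takes the cutoff as a fraction of the Nyquist frequency.
    # 1/factor corresponds to the Nyquist frequency after downsampling.
    taps = signal.firwin(numtaps, 1.0 / factor)
    # Convolving with the filter taps performs the low-pass filtering.
    filtered = np.convolve(samples, taps, mode='same')
    # Keep every factor-th sample of the filtered signal.
    return filtered[::factor]
```

Usage would be something like `small = lowpass_and_decimate(data)` with `data` loaded by `numpy.fromfile` as above. Note that `mode='same'` keeps the output aligned with the input but leaves some edge effects in the first and last few samples.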

Thank you so much. Would `numpy.fromfile` read two bytes as an `int` to represent a sample? If so, could it be made to read the bytes as floats? — Kennedy, Aug 20 '12 at 02:03
You can set the dtype argument to np.float32 or np.float64 for float data. — HYRY, Aug 20 '12 at 03:37
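One caveat worth spelling out: `dtype` tells `numpy.fromfile` how the bytes are stored on disk, so a float dtype only fits a file that actually contains float samples. If the file holds 16-bit integer samples, as the question assumes, one option is to read it as int16 and convert afterwards. A small sketch, where the helper name `pcm16_to_float` and the scaling by 32768 (a common convention, not something from the paper) are assumptions:

```python
import numpy as np

def pcm16_to_float(raw):
    """Convert int16 samples to float32 values in the range [-1.0, 1.0)."""
    return raw.astype(np.float32) / 32768.0

# Usage, assuming "test.pcm" holds signed 16-bit little-endian samples:
# raw = np.fromfile("test.pcm", dtype="<i2")
# data = pcm16_to_float(raw)
```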

score 1 · Answer 2 · answered Aug 20 '12 at 02:41

It sounds like the algorithm you're using is doing a filter-and-decimate operation to reduce the sample rate of the data by a factor of 8. This results in fewer samples being fed to downstream functions that may be computationally expensive but do not need the full bandwidth of the input data. The low-pass filter has to come first: keeping every 8th sample divides the sample rate by 8, and any content above half of the new sample rate would otherwise alias into the lower frequencies. The convolution you mention performs that low-pass filtering by convolving the input data with the impulse response of the desired filter. These are standard signal processing operations which you should be able to read up on in any text on digital signal processing.
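To see why the filter matters, here is a small experiment, offered as a sketch. The 44100 Hz sample rate and the 5 kHz test tone are assumptions chosen for the demonstration, and `scipy.signal.decimate` stands in for whatever filter the paper uses. After decimating by 8 the new sample rate is 5512.5 Hz, so a 5 kHz tone cannot be represented any more.

```python
import numpy as np
from scipy import signal

fs = 44100      # assumed input sample rate in Hz
factor = 8      # decimation factor from the question

t = np.arange(fs) / fs                 # one second of sample times
tone = np.sin(2 * np.pi * 5000 * t)    # 5 kHz tone, above the new Nyquist (~2756 Hz)

# No filtering: the tone survives at full strength but at a wrong frequency.
naive = tone[::factor]

# Low-pass filter first, then downsample: the tone is removed instead.
filtered = signal.decimate(tone, factor, ftype='fir')

def rms(x):
    """Root-mean-square level of a signal."""
    return np.sqrt(np.mean(np.square(x)))
```

Comparing `rms(naive)` with `rms(filtered)` shows the difference: the naive version still contains a strong tone (aliased down to about 512.5 Hz), while the filtered version is close to silence.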
