
I've been working with FFT, and I'm currently trying to read a sound waveform from a file, take its FFT (and eventually modify it), and then write the result back to a file. I've taken the FFT of the waveform and then applied an inverse FFT to it, but the output file doesn't sound right at all. I haven't done any filtering on the waveform. I'm just testing getting the frequency data and then putting it back into a file. It should sound the same, but it sounds wildly different.

I have since worked on this project a bit more, but haven't yet gotten the desired results. The output sound file is noisy (it is louder, and it contains extra noise that wasn't present in the original file), and sound from one channel has leaked into the other channel (which was previously silent). The input sound file is a stereo, 2-channel file with sound coming from only one channel. Here's my code:

import scipy
import wave
import struct
import numpy
import pylab

from scipy.io import wavfile

rate, data = wavfile.read('./TriLeftChannel.wav')

filtereddata = numpy.fft.rfft(data, axis=0)
print(data)

filteredwrite = numpy.fft.irfft(filtereddata, axis=0)
print(filteredwrite)

wavfile.write('TestFiltered.wav', rate, filteredwrite)

I don't quite see why this doesn't work.

I've zipped up the problem .py file and audio file, if that can help solve the issue here.

mkrieger1
SolarLune
  • Try adding `filteredwrite = numpy.round(filteredwrite).astype('int16')` before you save. – Bi Rico Apr 26 '12 at 15:50
  • @Bago - Thanks a lot! That totally fixed the problem. I was wondering, does forcing the filtered ifft to 'int16' mean that it will be a 16-bit depth sound file? – SolarLune Apr 27 '12 at 02:00
  • I don't know too much about wav files, I always assumed they were raw, uncompressed data, but you'll have to read up on the wav format specs to know for sure. – Bi Rico Apr 27 '12 at 17:27
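The fix suggested in the comments can be sketched on synthetic data. This is a minimal sketch, not the asker's file: the `stereo` array below is a made-up stand-in for what `wavfile.read` returns for a 16-bit stereo file with one silent channel. It shows that `irfft` hands back `float64` samples, and that rounding and casting restores the original integers. As far as I know, `scipy.io.wavfile.write` chooses the sample format from the array's dtype, so an `int16` array gives a 16-bit PCM file while a `float64` array is written as floating-point samples, which would account for the distorted output.

```python
import numpy as np

# Hypothetical stand-in for a 16-bit stereo file: the left channel carries
# a sine tone, the right channel is silent. Shape is (samples, channels).
rate = 8000
t = np.arange(rate) / rate
left = np.round(10000 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)
stereo = np.column_stack([left, np.zeros_like(left)])

# Round trip along axis 0 (the sample axis), with no filtering in between.
spectrum = np.fft.rfft(stereo, axis=0)
roundtrip = np.fft.irfft(spectrum, n=stereo.shape[0], axis=0)

print(roundtrip.dtype)  # float64, not int16

# Round, clip to the int16 range, and cast back before saving.
limits = np.iinfo(np.int16)
restored = np.clip(np.round(roundtrip), limits.min, limits.max).astype(np.int16)

print(np.array_equal(restored, stereo))
```

The clip step is a precaution for the case where a real filter pushes samples outside the 16-bit range; without it, the cast would wrap around instead of saturating.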

4 Answers

  1. You don't appear to be applying any filter here
  2. You probably want to take the ifft of the fft (post-filtering), not of the input waveform.
wim

Shouldn't it be more like this?

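filtereddata = numpy.fft.fft(data, axis=0)  # transform each channel (samples run along axis 0)
# do fft stuff to filtereddata
filteredwrite = numpy.fft.ifft(filtereddata, axis=0).real  # ifft returns complex values; keep the real part
# cast back to integers before saving (assumes 16-bit PCM input)
wavfile.write('TestFiltered.wav', rate, numpy.round(filteredwrite).astype('int16'))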
John La Rooy
>>> import numpy as np
>>> a = np.vstack([np.ones(11), np.arange(11)])

# We have two channels along axis 0, the signals are along axis 1
>>> a
array([[  1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.],
       [  0.,   1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.]])
>>> np.fft.irfft(np.fft.rfft(a, axis=1), axis=1)
array([[  1.1       ,   1.1       ,   1.1       ,   1.1       ,
          1.1       ,   1.1       ,   1.1       ,   1.1       ,
          1.1       ,   1.1       ],
       [  0.55      ,   1.01836542,   2.51904294,   3.57565618,
          4.86463721,   6.05      ,   7.23536279,   8.52434382,
          9.58095706,  11.08163458]])
# irfft returns an even number of samples along axis=1 (10 here), even though a was (2, 11)

# When a has an even length along axis 1, we get a back after the irfft.
>>> a = np.vstack([np.ones(10), np.arange(10)])
>>> np.fft.irfft(np.fft.rfft(a, axis=1), axis=1)
array([[  1.00000000e+00,   1.00000000e+00,   1.00000000e+00,
          1.00000000e+00,   1.00000000e+00,   1.00000000e+00,
          1.00000000e+00,   1.00000000e+00,   1.00000000e+00,
          1.00000000e+00],
       [  7.10542736e-16,   1.00000000e+00,   2.00000000e+00,
          3.00000000e+00,   4.00000000e+00,   5.00000000e+00,
          6.00000000e+00,   7.00000000e+00,   8.00000000e+00,
          9.00000000e+00]])

# It seems like your signals are along axis 0; here is an example where the signals are on axis 0
>>> a = np.vstack([np.ones(10), np.arange(10)]).T
>>> a
array([[ 1.,  0.],
       [ 1.,  1.],
       [ 1.,  2.],
       [ 1.,  3.],
       [ 1.,  4.],
       [ 1.,  5.],
       [ 1.,  6.],
       [ 1.,  7.],
       [ 1.,  8.],
       [ 1.,  9.]])
>>> np.fft.irfft(np.fft.rfft(a, axis=0), axis=0)
array([[  1.00000000e+00,   7.10542736e-16],
       [  1.00000000e+00,   1.00000000e+00],
       [  1.00000000e+00,   2.00000000e+00],
       [  1.00000000e+00,   3.00000000e+00],
       [  1.00000000e+00,   4.00000000e+00],
       [  1.00000000e+00,   5.00000000e+00],
       [  1.00000000e+00,   6.00000000e+00],
       [  1.00000000e+00,   7.00000000e+00],
       [  1.00000000e+00,   8.00000000e+00],
       [  1.00000000e+00,   9.00000000e+00]])
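One way to handle the odd-length case, as a sketch: `irfft` accepts an `n` argument giving the desired output length along the transformed axis. Without it, the output length defaults to `2*(m-1)` for `m` frequency bins, which is always even. Passing the original length recovers an odd-length signal.

```python
import numpy as np

# Same two-channel, 11-sample array as above: signals along axis 1.
a = np.vstack([np.ones(11), np.arange(11)])

spectrum = np.fft.rfft(a, axis=1)                     # 6 frequency bins per row
default = np.fft.irfft(spectrum, axis=1)              # length 2*(6-1) = 10
exact = np.fft.irfft(spectrum, n=a.shape[1], axis=1)  # length 11

print(default.shape, exact.shape)
print(np.allclose(exact, a))
```

For audio read from a file, this would mean passing the number of samples in the file as `n`, so the round trip does not depend on whether that number happens to be even.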
Bi Rico

Two problems.

You are FFTing 2 channel data. You should only FFT 1 channel of mono data for the FFT results to make ordinary sense. If you want to process 2 channels of stereo data, you should IFFT(FFT()) each channel separately.

You are using a real fft, which throws away information, and thus makes the fft non-invertible.

If you want to invert, you will need to use an FFT which produces a complex result, and then IFFT this complex frequency domain vector back to the time domain. If you modify the frequency domain vector, make sure it stays conjugate symmetric if you want a strictly real result (minus numerical noise).
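A sketch of what keeping the spectrum conjugate symmetric can look like, using a made-up one-channel signal. The sample rate, tone frequencies, and cutoff are arbitrary illustration values. Bins are selected by the absolute value of their frequency, so each positive-frequency bin is treated the same as its negative-frequency mirror, and the inverse transform comes back real up to rounding error.

```python
import numpy as np

rate = 8000                      # assumed sample rate, for illustration only
t = np.arange(1024) / rate
# Mono test signal: a 250 Hz tone plus a weaker 3000 Hz tone.
low = np.sin(2 * np.pi * 250 * t)
signal = low + 0.5 * np.sin(2 * np.pi * 3000 * t)

spectrum = np.fft.fft(signal)
freqs = np.fft.fftfreq(signal.size, d=1 / rate)

# Zero every bin above the cutoff. Masking on |f| removes positive and
# negative frequencies together, which preserves conjugate symmetry.
cutoff = 1000
spectrum[np.abs(freqs) > cutoff] = 0

result = np.fft.ifft(spectrum)
imag_peak = np.max(np.abs(result.imag))
print(imag_peak)                 # numerical noise only
filtered = result.real           # safe to drop the imaginary part here
```

For stereo data shaped `(samples, channels)`, the same steps could be applied with `axis=0` on both transforms, which processes each channel independently.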

hotpaw2
  • you can fft multi-channel data, you just need to use a 2d array and make sure the axis keyword is set correctly (-1 by default), and `irfft(rfft(n))` should return n (within machine precision). – Bi Rico Apr 19 '12 at 23:20
  • * `irfft(rfft(n))` seems to be best behaved if n.shape[axis] is even. – Bi Rico Apr 19 '12 at 23:37
  • @Bago - Sorry about taking so long to go about this, but could you expand a bit on what you mean? What do you mean by 'use a 2d array'? You mean a NumPy array, right? – SolarLune Apr 25 '12 at 01:22
  • @Bago - I think I'm doing it correctly, but it's difficult to tell. The array is being read in with wavfile.read('./TriLeftChannel.wav'), but the shape is (x, 2), with x being a high number of samples. So, it's already a 2 dimensional array. I'm specifying the axis when I use FFT and IFFT, but it didn't change the output... – SolarLune Apr 25 '12 at 01:36
  • @SolarLune, In cases like this a short example would really help. I think you'll find that if you give enough info on SO to help others reproduce your problem you'll get a lot more feedback. – Bi Rico Apr 25 '12 at 01:41
  • @Bago - You're right. I've posted my problem code above, and made a zip with the .py file and my audio file, in case that's helpful. – SolarLune Apr 26 '12 at 07:02
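To illustrate the `axis` point from the comments with a sketch: for an array shaped `(samples, channels)`, as the asker reports, the transform has to run along axis 0. The default `axis=-1` would instead transform each two-element row, which combines the left and right samples rather than analysing either channel over time. The array below is made up for illustration.

```python
import numpy as np

# Hypothetical (samples, channels) data: left channel ramps, right is silent.
data = np.column_stack([np.arange(8.0), np.zeros(8)])

per_channel = np.fft.rfft(data, axis=0)   # one spectrum per channel
per_sample = np.fft.rfft(data)            # default axis=-1: across channels

print(per_channel.shape)  # (5, 2)
print(per_sample.shape)   # (8, 2)

# The silent channel stays empty only when transforming along axis 0.
print(np.allclose(per_channel[:, 1], 0))
# Along axis -1 each row becomes [left + right, left - right].
print(np.allclose(per_sample[:, 0], data[:, 0] + data[:, 1]))
print(np.allclose(per_sample[:, 1], data[:, 0] - data[:, 1]))
```

An unfiltered round trip still returns the input with either axis, so a wrong axis would only show up as channel mixing once the spectrum is modified.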