-5

I'm trying to get volume-time graph of .wav file. First, I recorded sound (patient exhalations) via android as .wav file, but when I read this .wav file in MATLAB it has negative values. What is the meaning of negative values? Second, MATLAB experts could you please check if the code below does the same as written in my comments? Also another question. Y = fft(WindowArray); p = abs(Y).^2; I took the power of values returned from fft...is that correct and what is the goal of this step??

[data, fs] =  wavread('newF2');
% read exhalation audio wav  file (1 channel, mono)
% frequency is 44100 HZ
% windows of 0.1 s and overlap of 0.05 seconds
WINDOW_SIZE = fs*0.1; %4410 = fs*0.1
array_size = length(data); % array size of data
numOfPeaks = (array_size/(WINDOW_SIZE/2)) - 1;
step = floor(WINDOW_SIZE/2); %step size used in loop
transformed = data;
start =1;
k = 1;
t = 1;
g = 1;
o = 1;
% performing fft on each window and finding the peak of windows 
while(((start+WINDOW_SIZE)-1)<=array_size) 
    j=1;
    i =start;
    while(j<=WINDOW_SIZE)
        WindowArray(j) = transformed(i);
        j = j+1;
        i = i +1;
    end
    Y = fft(WindowArray);
    p = abs(Y).^2; %power
      [a, b] = max(abs(Y)); % find max a and its indices b
      [m, i] = max(p); %the maximum of the power m and its indices i
      maximum(g) = m;
      index(t) = i;
      power(o) = a;
      indexP(g) = b;
      start = start + step;
      k = k+1;
      t = t+1;
      g = g+1;
      o=o+1;  
  end
% low pass filter 
% filtering noise: ignor frequencies that are less than 5% of maximum frequency
for u=1:length(maximum)
    M = max(maximum); %highest value in the array
    Accept = 0.05* M;
    if(maximum(u) > Accept)
        maximum = maximum(u:length(maximum));
        break;
    end
end
% preparing the time of the graph, 
% Location of the Peak flow rates are estimated
TotalTime = (numOfPeaks * 0.1);
time1 = [0:0.1:TotalTime];
if(length(maximum) > ceil(numOfPeaks));
maximum = maximum(1:ceil(numOfPeaks)); 
end
time = time1(1:length(maximum));
% plotting frequency-time graph
figure(1);
plot(time, maximum);
ylabel('Frequency');
xlabel('Time (in seconds)');
% plotting volume-time graph
figure(2);
plot(time, cumsum(maximum)); % integration over time to get volume 
ylabel('Volume');
xlabel('Time (in seconds)');
JohnColtraneisJC
  • 198
  • 1
  • 13

2 Answers2

0

(I only answer the part of the question which I understood)

Per default Matlab normalizes the audio wave to - 1...1 range. Use the native option if you want the integer data.

Daniel
  • 36,610
  • 3
  • 36
  • 69
0

First, in your code it should be p = abs(Y)**2, this is the proper way to square the values returned from the FFT. The reason why you take the absolute value of the FFT return values is because those number are complex numbers with a Real and Imaginary part, therefore the absolute value (or modulus) of an imaginary number is the magnitude of that number. The goal of taking the power could be for potentially obtaining an RMS value (root mean squared) of your overall amplitude values, but you could also have something else in mind. When you say volume-time I assume you want decibels, so try something like this:

def plot_signal(file_name):

sampFreq, snd = wavfile.read(file_name)

snd = snd / (2.**15) #Convert sound array to floating point values 
                     #Floating point values range from -1 to 1

s1 = snd[:,0] #left channel

s2 = snd[:,1] #right channel

timeArray = arange(0, len(snd), 1)
timeArray = timeArray / sampFreq
timeArray = timeArray * 1000  #scale to milliseconds

timeArray2 = arange(0, len(snd), 1)
timeArray2 = timeArray2 / sampFreq
timeArray2 = timeArray2 * 1000  #scale to milliseconds

n = len(s1)
p = fft(s1) # take the fourier transform 

m = len(s2) 
p2 = fft(s2)

nUniquePts = ceil((n+1)/2.0)
p = p[0:nUniquePts]
p = abs(p)

mUniquePts = ceil((m+1)/2.0)
p2 = p2[0:mUniquePts]
p2 = abs(p2)

'''
Left Channel
'''
p = p / float(n) # scale by the number of points so that
             # the magnitude does not depend on the length 
             # of the signal or on its sampling frequency  
p = p**2  # square it to get the power 




# multiply by two (see technical document for details)
# odd nfft excludes Nyquist point
if n % 2 > 0: # we've got odd number of points fft
    p[1:len(p)] = p[1:len(p)] * 2
else:
    p[1:len(p) -1] = p[1:len(p) - 1] * 2 # we've got even number of points fft

plt.plot(timeArray, 10*log10(p), color='k')
plt.xlabel('Time (ms)')
plt.ylabel('LeftChannel_Power (dB)')
plt.show()

'''
Right Channel
'''
p2 = p2 / float(m) # scale by the number of points so that
             # the magnitude does not depend on the length 
             # of the signal or on its sampling frequency  
p2 = p2**2  # square it to get the power 




# multiply by two (see technical document for details)
# odd nfft excludes Nyquist point
if m % 2 > 0: # we've got odd number of points fft
    p2[1:len(p2)] = p2[1:len(p2)] * 2
else:
    p2[1:len(p2) -1] = p2[1:len(p2) - 1] * 2 # we've got even number of points fft


plt.plot(timeArray2, 10*log10(p2), color='k')
plt.xlabel('Time (ms)')
plt.ylabel('RightChannel_Power (dB)')
plt.show()

I hope this helps.

JohnColtraneisJC
  • 198
  • 1
  • 13