I have an audio file and read all of its data into a byte buffer. Then I convert the byte[] to a float[] so that I can apply a Hamming window. The waveform of the audio is:

https://i.stack.imgur.com/2NhTB.png

After applying the Hamming window:

https://i.stack.imgur.com/N87qE.png

Is the waveform with the Hamming window correct? Where is my mistake?

By the way, I use the NAudio library to process the audio:

WaveChannel32 wave = new WaveChannel32(new WaveFileReader("sesDosyası.wav"));
int length = (int)wave.Length; // stream length in bytes
byte[] buffer = new byte[length];
float[] data = new float[length / 4]; // 4 bytes per 32-bit float sample
int read = wave.Read(buffer, 0, length);
for (int i = 0; i < read / 4; i++)
{   
    data[i] = BitConverter.ToSingle(buffer, i * 4); //converting byte to float
    chart1.Series["wave"].Points.Add(data[i]); //first waveform
}

for (int j = 0; j < read/4; j++)
{
   data[j] = (float)(0.54 - 0.46 * Math.Cos((2 * Math.PI * data[j]) / (read / 4 - 1)));//hamming
   chart2.Series["wave"].Points.Add(data[j]); //second waveform
}
Qiu
Cengaver

2 Answers

It appears you are applying the window to the whole wave, so read is going to be huge, and the term inside the cosine is always going to be very near 0 for data values in [-1, 1].

So you always get 0.54 - 0.46*cos(0) = 0.54 - 0.46*1.0 = 0.08.
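
To make that arithmetic concrete, here is a small sketch that evaluates the expression from the question with a sample value inside the cosine. The class and method names are hypothetical, chosen only for illustration.

```csharp
using System;

public static class ConstantWindowCheck
{
    // Evaluates the question's expression, where a sample value
    // (not the sample index) ends up inside the cosine.
    // 'total' is the number of samples and must be at least 2.
    public static double Buggy(double sample, int total)
    {
        return 0.54 - 0.46 * Math.Cos(2 * Math.PI * sample / (total - 1));
    }
}
```

For a large total such as 1,000,000 and any sample in [-1, 1], the cosine argument is tiny, so Buggy returns a value that is practically 0.08 every time. That is why the second chart looks like a flat line.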

According to Wikipedia, only j should be inside the cosine. That gives the window, which you then multiply by data[j]:

double window = 0.54 - 0.46 * Math.Cos((2 * Math.PI * j) / (total - 1));
float hammed_signal = (float)(data[j] * window);
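
As a minimal sketch of how those two lines might fit into a complete routine, the helper below builds the window once and then scales each sample by it. The names HammingDemo, HammingWindow and ApplyWindow are hypothetical and are not part of NAudio.

```csharp
using System;

public static class HammingDemo
{
    // Returns the coefficients w[j] = 0.54 - 0.46 * cos(2*pi*j / (N - 1)).
    public static float[] HammingWindow(int length)
    {
        var window = new float[length];
        if (length == 1)
        {
            window[0] = 1f; // avoid dividing by zero for a single sample
            return window;
        }
        for (int j = 0; j < length; j++)
        {
            window[j] = (float)(0.54 - 0.46 * Math.Cos(2 * Math.PI * j / (length - 1)));
        }
        return window;
    }

    // Multiplies each sample by its window coefficient and returns a new array.
    public static float[] ApplyWindow(float[] samples)
    {
        float[] window = HammingWindow(samples.Length);
        var result = new float[samples.Length];
        for (int j = 0; j < samples.Length; j++)
        {
            result[j] = samples[j] * window[j];
        }
        return result;
    }
}
```

Calling HammingDemo.ApplyWindow(data) on one block of samples leaves the middle of the block almost unchanged and tapers both ends down to 8% of their original amplitude.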

Why are you trying to apply a Hamming window to the whole wave?

AShelly
  • How large should my blocks be? I mean, if I read the data in 4096-byte blocks, will that solve it? – Cengaver Mar 01 '13 at 15:12
  • If you replace `data[j]` with `j`, you will get the right window. The right block size depends on what you want to do. – AShelly Mar 01 '13 at 15:17
  • I read the audio data in 16-byte blocks and convert them to float arrays. I apply the Hamming window to each block, and it produces: http://c1303.hizliresim.com/17/1/kmnc7.png – Cengaver Mar 01 '13 at 15:28
  • OK, never mind, I understand now. I must use your code. Thank you so much! I have one last question. When I read the data in blocks of n bytes, the windowed waveform looks like this: n=16 => http://d1303.hizliresim.com/17/1/kmnzq.png, n=4096 => http://d1303.hizliresim.com/17/1/kmp0z.png, and n=16384 => http://d1303.hizliresim.com/17/1/kmp1s.png. How large should my blocks be? – Cengaver Mar 01 '13 at 15:43
  • No one can tell you what block size to use unless you tell us why you are applying a Hamming window. What audio effect are you trying to achieve? – AShelly Mar 01 '13 at 16:29
  • I want to extract features from the audio using these steps, some of which make the waveform easier to analyze: 1-frame blocking 2-windowing 3-FFT 4-Mel-frequency warping 5-cepstrum – Cengaver Mar 01 '13 at 16:50
  • See http://stackoverflow.com/a/5570485/10396 for a good explanation of window sizes for FFT. Basically you need to balance the frequency resolution you need after the FFT with the time resolution needed after you convert back. If you FFT the whole wave, you get super-detailed frequency resolution, but you get one big smear of sound when you convert back to a wave. – AShelly Mar 01 '13 at 17:10
  • Actually, I will apply the FFT to the speech regions of the wave, not the whole wave. Thanks for all your help. – Cengaver Mar 01 '13 at 17:27
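
The block-size trade-off discussed in the comments can be sketched as follows. This is only an illustration under stated assumptions: the 44100 Hz sample rate and the 50% overlap in the usage note are illustrative choices, not values from the question, and the FrameBlocking class and its methods are hypothetical names. Note that n in the comments is counted in bytes, so with 4-byte float samples n=16 is only 4 samples per frame.

```csharp
using System;
using System.Collections.Generic;

public static class FrameBlocking
{
    // Width of one FFT bin in Hz for a frame of frameSize samples.
    public static double FrequencyResolutionHz(int sampleRate, int frameSize)
    {
        return (double)sampleRate / frameSize;
    }

    // Duration of one frame in milliseconds.
    public static double FrameDurationMs(int sampleRate, int frameSize)
    {
        return 1000.0 * frameSize / sampleRate;
    }

    // Splits samples into frames of frameSize samples, advancing by hop samples,
    // and multiplies each frame by a Hamming window.
    // A trailing partial frame is dropped.
    public static List<float[]> WindowedFrames(float[] samples, int frameSize, int hop)
    {
        if (frameSize < 2) throw new ArgumentException("frameSize must be at least 2");
        if (hop < 1) throw new ArgumentException("hop must be at least 1");

        var frames = new List<float[]>();
        for (int start = 0; start + frameSize <= samples.Length; start += hop)
        {
            var frame = new float[frameSize];
            for (int j = 0; j < frameSize; j++)
            {
                // The index within the frame, not the sample value, drives the window.
                double w = 0.54 - 0.46 * Math.Cos(2 * Math.PI * j / (frameSize - 1));
                frame[j] = (float)(samples[start + j] * w);
            }
            frames.Add(frame);
        }
        return frames;
    }
}
```

For example, FrameBlocking.WindowedFrames(data, 1024, 512) would give 1024-sample frames with 50% overlap. Assuming 44100 Hz audio, each such frame spans roughly 23 ms and each FFT bin is roughly 43 Hz wide. Larger frames narrow the bins but blur events in time, which is the balance described above.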

I think your Hamming line is wrong:

data[j] = (float)(0.54 - 0.46 * Math.Cos((2 * Math.PI * data[j]) / (read - 1)));

Your loop runs from 0 to read / 4, but you divide by read - 1. So if you have read 16 bytes, the loop covers only 4 samples but divides by 15, not 3.

var total = read / 4;

for (int j = 0; j < total; j++)
{
   data[j] *= (float)(0.54 - 0.46 * Math.Cos((2 * Math.PI * j) / (total - 1))); // Hamming: index j goes inside the cosine, and the result scales the sample
   chart2.Series["wave"].Points.Add(data[j]); //second waveform
}
Pondidum