I have a 21000x13 matrix of MFCCs computed from a wav file. I also have a label file, a text file that lists the start time, end time, and label for each time period. I need to find the time of each frame in the MFCC matrix so that the labels can be assigned per frame. Does anyone know the default frame length (20 ms/30 ms/50 ms) and the overlap (30%/40%/50%)? With those I could work out the time at which each frame falls from the frame number and the frame length, adjusted for the overlap. For example, with 20 ms frames and no overlap, frame 1 would be at 1 x 20 ms = 20 ms and frame 2 at 2 x 20 ms = 40 ms, but with 50% overlap frame 2 would be at 30 ms instead.
1 Answer
The default sampling rate is 11025 Hz.
The default frame size is the largest power of 2 that is less than 0.03 * sampling rate. For the default sampling rate, the frame size is 256 samples. You can calculate it with this formula:
pow2(floor(log2(0.03*fs)))
Default overlap is 50%.
So the default frame increment is 128 samples. To get a frame's time offset in seconds, multiply the frame number by the frame shift (128) and divide by the sample rate (11025).
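As a rough illustration of that arithmetic, here is a minimal Python sketch (Python rather than MATLAB, just for convenience). The function names are made up for this example and are not part of VOICEBOX, and it assumes zero-based frame indices, so the first frame starts at 0 s. If you count frames from 1, as MATLAB does, use `frame_number - 1` as the index.

```python
import math


def default_frame_params(fs=11025):
    """Return (frame_length, frame_increment) in samples for the defaults above.

    Hypothetical helper: mirrors pow2(floor(log2(0.03*fs))) and a 50% overlap.
    """
    n = 2 ** math.floor(math.log2(0.03 * fs))  # power of 2 not exceeding 0.03*fs
    inc = n // 2  # 50% overlap, so the shift is half a frame
    return n, inc


def frame_start_time(frame_index, fs=11025, inc=128):
    """Start time in seconds of a frame, given its zero-based index."""
    return frame_index * inc / fs


n, inc = default_frame_params()
print(n, inc)  # frame length and frame shift in samples
print(round(1000 * n / 11025, 1), "ms frame length")
print(round(1000 * frame_start_time(1), 1), "ms start of the second frame")
```

If you would rather label a frame by its midpoint than by its start, add half a frame, `n / (2 * fs)` seconds, to the start time.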
You can find the details in the header here
http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/doc/voicebox/melcepst.html
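Once the frame times are known, attaching the labels is a lookup. Below is a minimal Python sketch, assuming the label file has already been parsed into `(start, end, label)` tuples with times in seconds. The function name, the segment values, and the choice to label a frame by its centre time are all assumptions made for illustration, not something the toolbox prescribes.

```python
def label_frames(num_frames, segments, fs=11025, n=256, inc=128, default=None):
    """Give each frame the label of the segment that contains its centre time.

    segments: list of (start_sec, end_sec, label) tuples.
    Frames that fall outside every segment get `default`.
    """
    labels = []
    for i in range(num_frames):
        centre = (i * inc + n / 2) / fs  # centre of zero-based frame i, in seconds
        label = default
        for start, end, name in segments:
            if start <= centre < end:
                label = name
                break
        labels.append(label)
    return labels


# Hypothetical label data, standing in for the parsed text file
segments = [(0.0, 0.05, "sil"), (0.05, 0.2, "speech")]
print(label_frames(5, segments))
```

For 21000 frames and a long label file the nested loop is slow but still workable. If it becomes a bottleneck, sorting the segments and walking through them once is an easy improvement.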

Nikolay Shmyrev
-
So if I use the defaults when computing the MFCCs, what will the frame length be in milliseconds? – chris Feb 18 '14 at 11:07
-
The frame size is 23.2 ms and the frame shift is 11.6 ms. – Nikolay Shmyrev Feb 18 '14 at 16:04
-
Which means that if the first frame starts at 0 ms, the next one will start at 11.6 ms? – chris Feb 19 '14 at 19:59