I have a conceptual problem.

I know what the mel scale is and what it represents, and I know that a mel spectrogram still contains more information than I need.

As I understand it, MFCCs are what we use when we want to reduce the amount of information in the spectrogram.

But I really don't understand what MFCCs are or what they represent. I use an MFCC matrix in a speech recognition pipeline, but I don't understand what the numbers inside each vector represent.

The array is 13x130 and I don't know what all these floats mean. I understand that the longer my audio track is, the bigger the matrix gets (e.g. 13x250, 13x400).
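To make the size relationship concrete, here is a minimal sketch of how the matrix shape might follow from the audio length. The sample rate (22050 Hz), the hop length (512 samples), and the centered-frame counting rule are assumptions on my part, not values I have confirmed for my setup; `mfcc_shape` is a hypothetical helper, not a library function.

```python
def mfcc_shape(duration_s, sample_rate=22050, hop_length=512, n_mfcc=13):
    """Estimate the (rows, columns) of an MFCC matrix.

    Assumptions (not taken from any specific library setup):
    - one column per analysis frame, frames spaced hop_length samples apart
    - the signal is padded so frames are centered, giving 1 + n_samples // hop_length frames
    - each column holds n_mfcc coefficients describing that single frame
    """
    n_samples = round(duration_s * sample_rate)
    n_frames = 1 + n_samples // hop_length
    return (n_mfcc, n_frames)


# Under these assumed settings, a 3-second clip gives 130 columns.
print(mfcc_shape(3.0))  # (13, 130)
```

Under those assumptions, the 13 rows stay fixed (one per coefficient) and only the number of columns grows with the duration, which would match the 13x250 and 13x400 shapes I see for longer tracks.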

I hope I have made myself clear.

Anthos89
  • This article looks like a good start: http://practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/ I hope it helps! – Maantje Nov 26 '15 at 23:19
  • Thank you, it was helpful, but I feel that I still don't get the full practical concept. The article says: "The MFCC feature vector describes only the power spectral envelope of a single frame, but it seems like speech would also have information in the dynamics i.e. what are the trajectories of the MFCC coefficients over time". What are the trajectories of the MFCCs? – Anthos89 Nov 26 '15 at 23:36
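
For reference, the "trajectories" in that quote are usually captured as delta coefficients: for each of the 13 rows, an estimate of how that coefficient changes from frame to frame. Below is a minimal sketch using the common regression formula d_t = sum_{n=1..N} n (c_{t+n} - c_{t-n}) / (2 sum_{n=1..N} n^2). The function name `delta`, the context width N = 2, and the edge handling (repeating the first and last frame) are assumptions for illustration, not what any particular library does.

```python
def delta(coeffs, n_context=2):
    """Delta (trajectory) features for an MFCC matrix.

    coeffs: list of rows, one row per coefficient, each row a list over frames.
    Edge frames are handled by repeating the first/last frame (an assumption).
    """
    denom = 2 * sum(n * n for n in range(1, n_context + 1))
    out = []
    for row in coeffs:
        last = len(row) - 1
        d_row = []
        for t in range(len(row)):
            acc = 0.0
            for n in range(1, n_context + 1):
                ahead = row[min(t + n, last)]   # frame t+n, clamped at the end
                behind = row[max(t - n, 0)]     # frame t-n, clamped at the start
                acc += n * (ahead - behind)
            d_row.append(acc / denom)
        out.append(d_row)
    return out


# A coefficient that rises by 1 per frame has a delta of 1 away from the edges.
ramp = [[float(t) for t in range(6)]]
print(delta(ramp)[0])
```

The result has the same shape as the input, so a 13x130 MFCC matrix would yield a 13x130 delta matrix that can be stacked under the original to form 26 features per frame.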
