GMM and MFCC for language identification

Asked Nov 10 '18 at 11:31


I am new to the machine learning domain. I am trying to implement an audio language detection system based on the MFCCs, deltas, delta-deltas and mel spectrum coefficients of an audio file. These features are extracted using librosa, which returns a 2D matrix of MFCCs per file. The problem is that I want to train a Gaussian Mixture Model on them. scikit-learn takes its input in the format (n_samples, n_features), but I have a 3D array of the form (n_samples, n_mfcc, n_time), built from the output of librosa.feature.mfcc(). How can I provide a 3D input to a GMM?

Also, is there a way to feed all four of the features mentioned above into the model?

Amit K.S

I think you should provide a [n_samples x n_mfcc] matrix for each n_time. – Nov 16 '18 at 18:25
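One possible reading of the comment above, as a minimal sketch rather than a definitive answer: treat every time frame as one GMM sample, so each file's (n_coeffs, n_time) matrix is transposed to (n_time, n_coeffs) and the frames of all files are concatenated row-wise. The four feature types can be stacked along the coefficient axis first, since they share the same number of frames. The helper names `stack_frames` and `frames_for_language`, the sizes 13 and 40, and the random arrays standing in for librosa output are all assumptions made for illustration.

```python
import numpy as np

def stack_frames(feature_blocks):
    """Stack one file's feature matrices, each (n_coeffs_i, n_time),
    into a single (n_time, total_coeffs) matrix with one row per frame."""
    stacked = np.vstack(feature_blocks)  # (total_coeffs, n_time)
    return stacked.T                     # (n_time, total_coeffs)

def frames_for_language(files):
    """Concatenate the frames of several files row-wise.
    Files may have different lengths; only total_coeffs must match."""
    return np.concatenate([stack_frames(blocks) for blocks in files], axis=0)

# Random stand-ins for real features (hypothetical sizes).
rng = np.random.default_rng(0)
n_mfcc, n_mels = 13, 40

def fake_file(n_time):
    mfcc = rng.normal(size=(n_mfcc, n_time))
    delta = rng.normal(size=(n_mfcc, n_time))
    delta2 = rng.normal(size=(n_mfcc, n_time))
    mel = rng.normal(size=(n_mels, n_time))
    return [mfcc, delta, delta2, mel]

files = [fake_file(120), fake_file(95)]
X = frames_for_language(files)
print(X.shape)  # (215, 79): 120 + 95 frames, 13 * 3 + 40 coefficients
```

In a real pipeline the stand-in arrays would come from librosa.feature.mfcc, librosa.feature.delta and librosa.feature.melspectrogram, and X could then be passed to sklearn.mixture.GaussianMixture(...).fit(X). A common setup, assumed here and not stated in the question, is to fit one GMM per language and pick the language whose model gives the highest score() on the frames of a test file.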

0 Answers