I am trying to classify input audio samples by spoken language using MFCC features. Two spoken languages are considered.

What I have tried so far:

n_components ranging from 32 to 512

Result

My attempt did not classify the audio samples accurately. I am not sure whether the approach above is sound, or how to choose the number of components to improve the result.

mah

1 Answer

One way to choose the number of components is to look at component responsibility: for each component, count how many samples are assigned to it, where a sample is assigned to the component most likely to have generated it. As you increase n_components, the number of samples assigned to each component decreases. If a component is responsible for too few samples, that can be a sign of overfitting.
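
As a rough sketch of that check, assuming scikit-learn's GaussianMixture and random data standing in for real MFCC frames (the helper name `component_counts`, the 13-coefficient frame size, and the diagonal covariance are my own choices for illustration), you could count the hard assignments per component like this:

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def component_counts(features, n_components, seed=0):
    """Fit a GMM and count how many samples each component 'wins'.

    A sample is assigned to the component with the highest posterior
    probability (responsibility) for it.
    """
    gmm = GaussianMixture(
        n_components=n_components,
        covariance_type="diag",  # assumption: diagonal covariances
        random_state=seed,
    )
    gmm.fit(features)
    hard_assignment = gmm.predict(features)  # argmax of responsibilities
    return np.bincount(hard_assignment, minlength=n_components)


# Synthetic stand-in for MFCC frames: 600 frames x 13 coefficients.
rng = np.random.default_rng(0)
frames = np.vstack([
    rng.normal(-2.0, 1.0, size=(300, 13)),
    rng.normal(2.0, 1.0, size=(300, 13)),
])

for k in (2, 8, 32):
    counts = component_counts(frames, k)
    print(f"n_components={k}: min={counts.min()}, max={counts.max()}")
```

If the smallest counts fall to a handful of samples (or zero) as n_components grows, the extra components are probably fitting noise rather than structure.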

Could you explain exactly how your classifier works?

A GMM is a generative model and can't be used directly for classification. The most common mistake is to assume that one Gaussian component should correspond to one class.
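
One common way around this, shown here only as a sketch and not necessarily what your code does, is to fit a separate GMM per language and pick the language whose model gives the higher average log-likelihood for the utterance. It assumes scikit-learn's GaussianMixture; the function names, the language labels, and the synthetic "MFCC" frames are hypothetical:

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def train_language_models(frames_by_language, n_components=4, seed=0):
    """Fit one GMM per language on that language's frames."""
    models = {}
    for language, frames in frames_by_language.items():
        gmm = GaussianMixture(
            n_components=n_components,
            covariance_type="diag",  # assumption: diagonal covariances
            random_state=seed,
        )
        models[language] = gmm.fit(frames)
    return models


def classify_utterance(models, utterance_frames):
    """Return the language whose GMM scores the utterance highest.

    GaussianMixture.score gives the average per-frame log-likelihood.
    """
    scores = {lang: gmm.score(utterance_frames) for lang, gmm in models.items()}
    return max(scores, key=scores.get)


# Synthetic stand-in for MFCC frames (13 coefficients per frame).
rng = np.random.default_rng(0)
training = {
    "lang_a": rng.normal(-1.0, 1.0, size=(500, 13)),
    "lang_b": rng.normal(1.0, 1.0, size=(500, 13)),
}
models = train_language_models(training)

test_utterance = rng.normal(1.0, 1.0, size=(200, 13))  # drawn like lang_b
print(classify_utterance(models, test_utterance))
```

With this setup n_components controls how finely each language's feature distribution is modelled, not the number of classes.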

podludek
  • First, I train the GMM to obtain the language models from the MFCC features of the audio input. Then I test the audio inputs against the language models to identify the spoken language. – mah Jan 30 '19 at 13:12