I am working on a speaker verification task.

The task is to calculate the similarity between two speech recordings and then compare it with a threshold. For example, if the similarity score between the two recordings is 70% and the threshold is 50%, the two recordings are judged to come from the same speaker.

The speech is text-independent: it can be any conversation.

I have experience using MFCC and GMM for speaker recognition, but this task is different: I only need to compare the features of two recordings to get a similarity score. I don't know which features are good for speaker verification, or which algorithm can calculate a similarity score between two patterns.

I hope to get your advice.

Many thanks.

Can Nguyen

2 Answers

The state of the art these days is x-vectors:

Deep Neural Network Embeddings for Text-Independent Speaker Verification

An implementation is available in Kaldi.
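
Once each recording has been reduced to a fixed-length embedding, the comparison asked about in the question comes down to scoring two vectors against a threshold. Here is a minimal sketch in plain Python, assuming the embeddings have already been extracted. The toy vectors, the function names and the 0.5 threshold are hypothetical. Cosine similarity is used here only as a simple stand-in; the Kaldi x-vector recipes score trials with PLDA instead.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)

def same_speaker(emb_a, emb_b, threshold=0.5):
    """Accept the trial if the score reaches the (hypothetical) threshold."""
    return cosine_similarity(emb_a, emb_b) >= threshold

# Toy 4-dimensional vectors standing in for real speaker embeddings.
enroll = [0.2, 0.1, -0.4, 0.9]
test = [0.25, 0.05, -0.35, 0.8]
print(same_speaker(enroll, test))
```

In practice the threshold would be tuned on held-out trials, for example at the equal error rate, not fixed in advance.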

Nikolay Shmyrev

I am also working on speaker verification, using the TIMIT dataset. I extracted MFCC features, trained a UBM on them, and adapted it to each speaker, using diagonal covariance matrices for the adaptation. How are you testing the WAV files? As for features, you can also use pitch and energy.
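
For the testing step, a common way to score a GMM-UBM system is the average per-frame log-likelihood ratio between the speaker-adapted model and the UBM. The post does not say how scoring is done, so the following is only a sketch under that assumption. It uses diagonal covariances and toy two-component models in two dimensions; all names and numbers are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_frame_loglik(x, weights, means, variances):
    """Log-likelihood of one frame under a diagonal GMM (log-sum-exp)."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(logs)
    return peak + math.log(sum(math.exp(l - peak) for l in logs))

def llr_score(frames, speaker_gmm, ubm):
    """Average log-likelihood ratio: speaker model versus UBM."""
    total = sum(
        gmm_frame_loglik(f, *speaker_gmm) - gmm_frame_loglik(f, *ubm)
        for f in frames
    )
    return total / len(frames)

# Toy models: the speaker model is the UBM with one mean shifted,
# mimicking mean-only adaptation (weights and variances are shared).
ubm = ([0.5, 0.5], [[0.0, 0.0], [4.0, 4.0]], [[1.0, 1.0], [1.0, 1.0]])
speaker = ([0.5, 0.5], [[1.0, 1.0], [4.0, 4.0]], [[1.0, 1.0], [1.0, 1.0]])

frames = [[1.0, 1.0], [0.9, 1.1]]  # stand-ins for MFCC frames
print(llr_score(frames, speaker, ubm))
```

A positive score favours the claimed speaker, and the decision is made by comparing the score with a threshold tuned on held-out trials.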