Which features and algorithms are good for speaker verification?

Question

I have a speaker verification task.

I need to compute a similarity score between two speech recordings and compare it with a threshold. For example, if the similarity score between the two recordings is 70% and the threshold is 50%, the recordings are judged to come from the same speaker.

The speech is text-independent; it can be any conversation.

I have experimented with MFCC and GMM for speaker recognition, but this task is different: I only need to compare the features of two recordings to get a similarity score. I don't know which features are good for speaker verification, or which algorithm can compute a similarity score between two patterns.

I hope to get your advice.

Many thanks.

score 1 · Accepted Answer · answered Jan 25 '18 at 16:00

The state of the art these days is x-vectors, described in this paper:

Deep Neural Network Embeddings for Text-Independent Speaker Verification

An implementation is available in Kaldi.
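
Once each recording has been mapped to a fixed-length embedding such as an x-vector, verification comes down to scoring two vectors and comparing the score with a threshold. Below is a minimal sketch of that last step, assuming the embeddings have already been extracted and using plain cosine similarity for simplicity (the paper and the Kaldi recipes score with PLDA instead). The function names, the toy vectors, and the 0.5 threshold are illustrative assumptions; in practice the threshold would be tuned on held-out trials.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)

def same_speaker(emb_a, emb_b, threshold=0.5):
    """Accept the trial when the score reaches the (assumed) threshold."""
    return cosine_similarity(emb_a, emb_b) >= threshold

# Toy 4-dimensional vectors standing in for real embeddings (hypothetical values).
enrol = [0.9, 0.1, 0.3, 0.2]
test_same = [0.8, 0.2, 0.35, 0.1]
test_diff = [-0.1, 0.9, -0.4, 0.3]

print(same_speaker(enrol, test_same))  # vectors point in a similar direction
print(same_speaker(enrol, test_diff))  # vectors point in different directions
```

Cosine similarity lies in [-1, 1], so a percentage-style score like the one in the question would need an explicit mapping or calibration step on top of it.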

Nikolay Shmyrev

score 0 · Answer 2 · answered Mar 01 '18 at 06:28

I am also working on the TIMIT dataset for speaker verification. I extracted MFCC features, trained a UBM on them, and adapted it to each speaker, using diagonal covariance matrices for the adaptation. How are you testing the WAV files? As for features, you can also use pitch and energy.
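
A rough sketch of how a GMM-UBM setup like this can produce a verification score. It assumes mean-only MAP adaptation with diagonal covariances and scores by average log-likelihood ratio between the adapted speaker model and the UBM. The relevance factor of 16, the toy 2-D frames standing in for MFCC vectors, and all function names are illustrative assumptions, not details taken from this answer.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def component_log_probs(x, gmm):
    """log(weight * density) for every mixture component."""
    return [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(gmm["weights"], gmm["means"], gmm["vars"])
    ]

def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood under the mixture."""
    total = sum(logsumexp(component_log_probs(x, gmm)) for x in frames)
    return total / len(frames)

def map_adapt_means(ubm, frames, relevance=16.0):
    """Mean-only MAP adaptation; weights and variances are kept from the UBM."""
    n_comp = len(ubm["weights"])
    dim = len(ubm["means"][0])
    counts = [0.0] * n_comp
    sums = [[0.0] * dim for _ in range(n_comp)]
    for x in frames:
        logp = component_log_probs(x, ubm)
        norm = logsumexp(logp)
        for k in range(n_comp):
            post = math.exp(logp[k] - norm)  # component posterior for this frame
            counts[k] += post
            for d in range(dim):
                sums[k][d] += post * x[d]
    # Equivalent to alpha * E[x] + (1 - alpha) * ubm_mean, alpha = n / (n + r).
    means = [
        [(sums[k][d] + relevance * ubm["means"][k][d]) / (counts[k] + relevance)
         for d in range(dim)]
        for k in range(n_comp)
    ]
    return {"weights": ubm["weights"], "means": means, "vars": ubm["vars"]}

def llr_score(frames, speaker, ubm):
    """Positive values favour the claimed speaker over the background model."""
    return avg_log_likelihood(frames, speaker) - avg_log_likelihood(frames, ubm)

# Toy 2-component UBM and hypothetical 2-D "feature frames".
ubm = {"weights": [0.5, 0.5],
       "means": [[0.0, 0.0], [3.0, 3.0]],
       "vars": [[1.0, 1.0], [1.0, 1.0]]}
enrol_frames = [[1.0, 1.2], [0.8, 1.0], [1.2, 0.9], [1.1, 1.1]]
speaker = map_adapt_means(ubm, enrol_frames)

target_score = llr_score([[0.9, 1.0], [1.1, 1.2]], speaker, ubm)
impostor_score = llr_score([[-1.0, -0.8], [-1.2, -1.1]], speaker, ubm)
print(target_score, impostor_score)
```

The decision is then made by comparing the score with a threshold chosen on development data, as in the original question.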
