Training a model in English but evaluating in another language for a speech diarization task?

Question

For a speech diarization task, can I train my model on an English dataset (utterances of single words) and then evaluate it in my own language? Or does this not make sense, and will the model perform poorly?

I need to implement a rather simple model, but there is no dataset in my language.

score 0 · Answer 1 · answered Jan 23 '20 at 11:58


Which language?
If the language is linguistically close to English, then you might get fairly reasonable results. I would suggest looking for datasets in the closest popular language.


Itai Peer


There is no dataset in Russian? I thought you were talking about some minor language with fewer than a million speakers. Just google "russian speech corpora". – Itai Peer Jan 23 '20 at 20:17
Also, Russian is far from English, but not as far as Asian languages, for example. – Itai Peer Jan 23 '20 at 20:18
