Building a new voice for Festival using HTS

Question

I am working on a project to build a speech synthesizer for my local language using an HMM-based approach. So far, I have been able to generate a forced alignment file (aligned.mlf) as explained in the HTK Book. However, I have not been able to find any step-by-step instructions on how to build the synthesizer using HTS. What I have done is download the sample Speaker Dependent Demo from the HTS website and run the training on its data. What I now have in the voice folder is a cmu_us_arctic_slt.htsvoice file. So my two-part question is:

1) How do I use this file as a voice in Festival?

2) How can I generate the label and utt files needed to train my voice from the forced alignment file I have?

Any help will be greatly appreciated. Thanks.

score 1 · Answer 1 · answered Dec 07 '16 at 11:59

You first have to implement support for your language in Festival (by writing the Scheme files) and build a unit selection voice as described in the documentation.

You need voice_lex.scm, voice_pos.scm, voice_clunits.scm and a few more.

The required files, such as the utts, are generated in the course of creating the unit selection voice.
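
For the label part of the question, here is a rough sketch of how the forced alignment file could be split into one time-aligned label file per utterance. It assumes aligned.mlf follows the usual HTK master label file layout (a `#!MLF!#` header, a quoted pattern line per utterance, `start end label` lines, and a lone `.` closing each entry). The function names `parse_mlf` and `write_labels` and the output directory `labels/mono` are made up for illustration. This only gives phone-level labels; it does not produce the utt files or the full-context labels, which still come from the Festival side.

```python
from pathlib import Path, PurePosixPath


def parse_mlf(text):
    """Return {utterance_name: [(start, end, label), ...]} from HTK MLF text."""
    utterances = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "#!MLF!#":
            continue
        if line.startswith('"'):
            # Entry header such as "*/utt1.lab": keep only the base name.
            current = PurePosixPath(line.strip('"')).stem
            utterances[current] = []
        elif line == ".":
            # A lone dot closes the current utterance entry.
            current = None
        elif current is not None:
            fields = line.split()
            # Keep only lines that carry start and end times; any extra
            # columns (score, word) after the label are ignored.
            if len(fields) >= 3 and fields[0].isdigit() and fields[1].isdigit():
                utterances[current].append(
                    (int(fields[0]), int(fields[1]), fields[2])
                )
    return utterances


def write_labels(utterances, out_dir):
    """Write one '<name>.lab' file per utterance with 'start end label' lines."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, segments in utterances.items():
        lines = [f"{start} {end} {label}\n" for start, end, label in segments]
        (out / f"{name}.lab").write_text("".join(lines))


if __name__ == "__main__":
    # File and directory names here are assumptions, adjust to your setup.
    mlf_text = Path("aligned.mlf").read_text()
    write_labels(parse_mlf(mlf_text), "labels/mono")
```

The times are copied through unchanged, so they stay in whatever unit the alignment used (HTK writes them in 100 ns units).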

score 1 · Answer 2 · answered Dec 13 '16 at 02:37

Building a new voice is quite hard work. I am also in the process of building a voice for my local language. I hope these links help:

[1] http://www.cstr.ed.ac.uk/projects/festival/manual/festival_24.html

[2] http://www.cs.ru.ac.za/research/groups/vrsig/pastprojects/049speechsynthesis/paper04.pdf
