I am using MaryTTS as a text-to-speech engine inside a Grails application. During app testing I found that the audio quality degrades drastically with increasing text length when using an HMM voice.

So naturally I tested via the MARY Web Client while tweaking all HMM-relevant parameters (F0Add, F0Scale and Rate), as well as removing them or leaving them at their default values, but without success.

The voice I am using is bits1-hsmm:5.2 (German, female).

Gradle dependency:

compile "de.dfki.mary:voice-bits1-hsmm:5.2"

The code is as simple as:

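import marytts.LocalMaryInterface

def marytts = new LocalMaryInterface()
marytts.locale = Locale.GERMAN          // use the German locale
def audio = marytts.generateAudio(text) // returns an AudioInputStream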

Everything works fine until the text to convert exceeds about 120 characters (not only in the code but also via the MARY Web Client).

Here is the text I used for the last tests:

Baumaßnahmen im Mai und Oktober Notwendige Instandhaltungsarbeiten an der Münchner S-Bahn-Stammstrecke sollen von nun an gebündelt stattfinden. Die Bahn möchte dadurch die baubedingten Fahrplaneinschränkungen durch gesperrte Gleise geringer halten.

To hear the difference in quality, synthesize only part of the text (the first few words) and then the whole text.

Another important point: this does not occur when using a Unit Selection voice.

Am I missing some configuration or a specific parameter set, or is this the standard behaviour of HMM voices in MaryTTS?

It would be great to be able to use this voice with decent quality, since Unit Selection voices are not available as standalone dependencies, and splitting the text into smaller parts and playing them sequentially is not really something I would consider.

Any input is appreciated.

Update

Further trial and error showed that the robotic background sound is added whenever the text contains punctuation marks such as . , : ; [ ] { }, independent of text length! I am not really sure what the root cause is, but at least with some text manipulation before the conversion the voice is usable.
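
For reference, a minimal sketch of that kind of text manipulation. It is only an illustration of the workaround, not a fix for the root cause: the class name MaryTextSanitizer and the method stripPunctuation are hypothetical and not part of MaryTTS, and the set of characters is simply the list of marks mentioned above. Each mark is replaced by a space so that neighbouring words stay separated, and hyphens (as in "S-Bahn-Stammstrecke") are left alone.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not part of MaryTTS): removes the punctuation marks
 * that were observed to trigger the robotic sound with the HMM voice.
 */
final class MaryTextSanitizer {

    // Assumption: only the marks listed in the update are affected: . , : ; [ ] { }
    private static final Pattern PUNCTUATION = Pattern.compile("[.,:;\\[\\]{}]");

    // Runs of whitespace left behind after the replacement
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    private MaryTextSanitizer() {
        // static helper, no instances
    }

    /** Replaces each listed mark with a space, then collapses and trims whitespace. */
    static String stripPunctuation(String text) {
        String withoutMarks = PUNCTUATION.matcher(text).replaceAll(" ");
        return WHITESPACE.matcher(withoutMarks).replaceAll(" ").trim();
    }
}
```

From the Groovy code above this would be called as marytts.generateAudio(MaryTextSanitizer.stripPunctuation(text)). Note that this also changes tokens that legitimately contain a dot or comma (a number such as 5.2 becomes "5 2"), and removing sentence punctuation probably affects where the voice pauses, so the output is worth checking by ear.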

Dmitrii Sidenko
D.Ivanov