SSML support in Android TTS?

Question

This question was asked several years ago, but hopefully things have changed...

Could someone point me to any documentation of which subset of SSML each version of Android supports in its Text-To-Speech engine?

I did some experiments using Flutter-TTS, which is just a layer that passes the text to be spoken to the underlying platform TTS service.

Some unknown subset of SSML does work on newer Android versions. For example, this SSML

<speak>before<break time="5s"/>after</speak>

does indeed produce a five-second pause between the words on API 27 and API 29. It does not work on API 21, but at least that version handles it gracefully by simply ignoring all tags. I have not tested other API levels yet. I also tried the prosody, phoneme, and lang tags, and none of them seem to work.
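In case some engine does not ignore tags it cannot handle, one defensive option is to strip the markup yourself before calling speak(). The following is only a minimal sketch of that idea in plain Java; the class name SsmlFallback and its methods are hypothetical helpers, not part of Android or Flutter-TTS, and the regex approach assumes simple, well-formed SSML without CDATA sections or comments.

```java
import java.util.regex.Pattern;

// Hypothetical helper: turns an SSML document into plain text as a fallback.
final class SsmlFallback {
    // Matches any tag such as <speak>, </speak> or <break time="5s"/>.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    private SsmlFallback() {}

    // Rough check: treats any text that starts with a <speak element as SSML.
    static boolean looksLikeSsml(String text) {
        return text != null && text.trim().startsWith("<speak");
    }

    // Replaces every tag with a space, unescapes the five predefined XML
    // entities (&amp; last, to avoid double unescaping), then collapses
    // runs of whitespace into single spaces.
    static String toPlainText(String ssml) {
        String noTags = TAG.matcher(ssml).replaceAll(" ");
        String unescaped = noTags
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                .replace("&amp;", "&");
        return WHITESPACE.matcher(unescaped).replaceAll(" ").trim();
    }
}
```

With the example above, SsmlFallback.toPlainText("<speak>before<break time=\"5s\"/>after</speak>") returns "before after". The pause is lost, but no markup reaches the engine.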

Kiran, I am looking for any documentation that describes the level of SSML support in Android Text-To-Speech engine. See https://www.w3.org/TR/speech-synthesis11 for full SSML spec. — AlexR, Jun 17 '20 at 19:14
I do not understand the connection between the Readium SDK (it is for creating ePub, right?) and my question. — AlexR, Jun 17 '20 at 19:26

Nerdy Bunz · Answer 1 · 2020-06-20T06:24:43.043

As long as the speak() method of the TextToSpeech class only accepts a String or CharSequence (which is then passed on to the speech engine), and as long as Android doesn't introduce a new method like TextToSpeech.speakSSML() in some future version (and enforce that all engines must support it)...

...then the way in which individual speech engines process these Strings will be unique to each engine and ultimately unpredictable, because until runtime you don't know which engine (or which version of it) the user has installed and/or selected in their settings.

Sure, a certain engine may soon claim to fully support SSML, and it sounds like maybe the Google engine (if that's what you're testing with) is beginning to experiment with it. Even if it does end up fully supporting SSML, your app would have to prompt the user to install and select that particular version of that particular engine.

I suspect the reason you're seeing it supported is that you're testing with some of the Google engine's "network voices", which have some cross-over with Google Cloud TTS.

I don't think the Android API level has anything to do with it; it just comes down to whether a user has "xyz engine ver 234123.21314" installed or not.
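If support really depends on the engine and its version, an app could keep its own allow-list of engine builds it has tested and only send SSML to those. The sketch below is one hedged way to express that in plain Java. The class SsmlEnginePolicy, the package name "com.example.tts", and the version numbers are all hypothetical. On Android the engine package name would presumably come from TextToSpeech.getDefaultEngine() and the installed version from the package manager, but that wiring is left out here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical policy: send SSML only to engine builds the app has tested itself.
final class SsmlEnginePolicy {
    // Engine package name -> lowest version code on which SSML was seen working.
    private final Map<String, Long> minVersionByEngine = new HashMap<>();

    // Registers an engine and the minimum version code that passed your own tests.
    SsmlEnginePolicy allow(String enginePackage, long minVersionCode) {
        minVersionByEngine.put(enginePackage, minVersionCode);
        return this;
    }

    // True only for a registered engine at or above its tested version.
    boolean shouldSendSsml(String enginePackage, long installedVersionCode) {
        Long min = minVersionByEngine.get(enginePackage);
        return min != null && installedVersionCode >= min;
    }
}
```

For example, after new SsmlEnginePolicy().allow("com.example.tts", 100L), a call to shouldSendSsml("com.example.tts", 150L) returns true, while an unknown engine or an older version returns false and the app can fall back to plain text.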

Google Cloud Text-to-Speech does support SSML, though... and since you're already going cross-platform using Flutter, that might be a better way to go... but of course, it requires an active network connection.
