Does Google's WaveNet support phonetic input (SSML phoneme elements)?

Question

I am working with a product that uses phonetic input to make TTS generate correct pronunciations for names. I don't see phoneme tags in Google's WaveNet TTS documentation https://cloud.google.com/text-to-speech/docs/ssml, but perhaps I'm missing them.

If any Google developers are listening, can they share plans to add phonetic input? Thanks.

Google did not support phonemes when you asked, but it does now. See this answer: https://stackoverflow.com/a/69374316/39946 – Lena Schimmel, Oct 06 '21 at 08:25
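For readers who want to see what phonetic input looks like in practice, here is a minimal sketch that builds an SSML string with a `<phoneme>` element as defined in the W3C SSML specification. It only constructs the markup and does not call any TTS service. The helper name `phoneme_tag`, the example name, and its IPA transcription are illustrative assumptions, not taken from Google's documentation; check the linked answer and the current SSML docs for which alphabets the service accepts.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme_tag(text: str, ipa: str) -> str:
    """Wrap `text` in an SSML <phoneme> element carrying an IPA transcription.

    Hypothetical helper for illustration; not part of any Google client library.
    """
    # quoteattr adds the surrounding quotes and escapes XML special characters.
    return f"<phoneme alphabet=\"ipa\" ph={quoteattr(ipa)}>{escape(text)}</phoneme>"


# Illustrative name and transcription (an assumption, not from the thread).
ssml = f"<speak>Please welcome {phoneme_tag('Siobhan', 'ʃɪˈvɔːn')}.</speak>"
print(ssml)
```

The resulting string would be passed to the synthesizer as SSML input rather than plain text.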

score 0 · Answer 1 · answered Jul 20 '20 at 23:52

Since these models are neural networks trained end to end (text -> net -> sound), they probably never had a separate phoneme step (text -> phoneme -> net -> sound).

That is to be expected: choosing the phonemes is left to the network itself, which removes an intermediate stage.

Daniel Möller

Undoubtedly true, but all TTS systems need an override for specialized vocabularies (proper names, technical terms). Regardless of the quality of the rules, there will always be a need. – murspieg Jul 22 '20 at 01:09
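The kind of override described here can be sketched as a small pronunciation lexicon applied to the text before synthesis. This is only an illustration under stated assumptions: it assumes the target TTS service accepts SSML `<phoneme>` elements (as the comment on the question reports for Google), and the function `apply_lexicon`, the lexicon entries, and their IPA transcriptions are hypothetical.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical lexicon: surface form -> IPA transcription (illustrative values).
LEXICON = {
    "Siobhan": "ʃɪˈvɔːn",
    "Niamh": "niːv",
}


def apply_lexicon(text: str, lexicon: dict) -> str:
    """Return SSML in which every lexicon word is wrapped in a <phoneme> tag."""
    if not lexicon:
        return f"<speak>{escape(text)}</speak>"
    # Longest entries first so a longer word wins over a shorter prefix.
    words = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")
    parts, pos = [], 0
    for match in pattern.finditer(text):
        # Escape the plain text between matches so the SSML stays well-formed.
        parts.append(escape(text[pos:match.start()]))
        word = match.group(0)
        parts.append(
            f"<phoneme alphabet=\"ipa\" ph={quoteattr(lexicon[word])}>"
            f"{escape(word)}</phoneme>"
        )
        pos = match.end()
    parts.append(escape(text[pos:]))
    return "<speak>" + "".join(parts) + "</speak>"


print(apply_lexicon("Call Siobhan & Bob.", LEXICON))
```

Matching is case-sensitive and whole-word only, which keeps the sketch simple; a production lexicon would likely need case folding and handling for possessives or inflected forms.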
