
I am working on a product that uses phonetic input so that TTS produces the correct pronunciation of names. I don't see phoneme tags in Google's WaveNet TTS documentation (https://cloud.google.com/text-to-speech/docs/ssml), but perhaps I'm missing them.

If any Google developers are listening, could they share any plans to add phonetic input? Thanks.

murspieg
  • Google did not support phonemes at the time you asked, but it supports them now. See this answer: https://stackoverflow.com/a/69374316/39946 – Lena Schimmel Oct 06 '21 at 08:25
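
For readers who have not used phonetic input before, here is a minimal sketch of the kind of override being asked about, assuming the service accepts the standard SSML `<phoneme>` element with an IPA `ph` attribute. The helper names `phoneme_ssml` and `speak` are hypothetical, and the IPA transcription is only illustrative; check the linked documentation for the alphabets and symbols a given voice actually supports.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme_ssml(text, ipa, alphabet="ipa"):
    """Wrap `text` in an SSML <phoneme> element (hypothetical helper).

    `ipa` is the phonetic transcription; `text` is the fallback spelling.
    """
    return (
        f"<phoneme alphabet={quoteattr(alphabet)} ph={quoteattr(ipa)}>"
        f"{escape(text)}</phoneme>"
    )


def speak(*parts):
    """Join already-escaped SSML fragments inside a <speak> root."""
    return "<speak>" + "".join(parts) + "</speak>"


# Illustrative transcription for a name; not taken from any official lexicon.
ssml = speak(
    escape("Please welcome "),
    phoneme_ssml("Siobhan", "ʃɪˈvɔːn"),
    escape("."),
)
print(ssml)
```

The resulting string would then be sent as SSML input (rather than plain text) in the synthesis request. Escaping the text and attribute values keeps names containing `&` or quotes from breaking the markup.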

1 Answer


Since these voices are based on neural networks "end to end" (text -> net -> sound), they probably never had a separate phoneme step (text -> phoneme -> net -> sound).

This is to be expected: choosing the phonemes is supposed to be the neural network's job, which removes the need for intermediate stages.

Daniel Möller
    Undoubtedly true, but all TTS systems need an override for specialized vocabularies (proper names, technical terms). Regardless of the quality of the rules, there will always be a need. – murspieg Jul 22 '20 at 01:09