I am using Unity and MRTK to develop an application for HoloLens 2. I am trying to use "speech styles" with the MRTK TextToSpeech.SpeakSsml method (MRTK API Reference). Text to speech works; however, I am unable to apply speech styles. Example SSML:

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
    <mstts:express-as style="cheerful">
      Cheerful hello!
    </mstts:express-as>
    <break time="1s" />
    <mstts:express-as style="angry">
      Angry goodbye!
    </mstts:express-as>
</speak>

My guess is that the default voice does not support speech styles. But, if I add a voice element to use another voice (there are four available voices listed in the documentation), TextToSpeech won't work at all. So, I am facing two problems:

  1. When using the SpeakSsml method instead of StartSpeaking, the selected voice (TextToSpeech.Voice) is disregarded and I am unable to change it using the voice element.
  2. I couldn't find documentation for supported SSML elements for available voices in MRTK TextToSpeech Class.

Any ideas or useful links?

Thank you!

Ahmadreza

1 Answer

The TextToSpeech component provided by MRTK depends on the Windows 10 SpeechSynthesizer class, so it works offline and does not support adjusting speaking styles. The mstts:express-as element is only available in the Azure Speech Service. For more information, please refer to this documentation: Improve synthesis with Speech Synthesis Markup Language (SSML)
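
As a rough illustration of what this means for the SSML in the question, here is a sketch that drops the mstts namespace and approximates the two tones with the standard SSML 1.0 prosody element. The pitch and rate values are illustrative guesses, not tuned settings, and it is an assumption that the voice installed on the device honors them. This does not reproduce the Azure speech styles; it only stays within the standard W3C namespace.

```xml
<!-- Sketch only: standard SSML 1.0 elements, no mstts extensions. -->
<!-- The pitch and rate values below are assumptions chosen for illustration. -->
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
    <!-- Higher pitch and faster rate as a rough stand-in for "cheerful" -->
    <prosody pitch="high" rate="fast">
      Cheerful hello!
    </prosody>
    <break time="1s" />
    <!-- Lower pitch and slower rate as a rough stand-in for "angry" -->
    <prosody pitch="low" rate="slow">
      Angry goodbye!
    </prosody>
</speak>
```

A string like this could be passed to SpeakSsml in place of the original one to check whether the offline synthesizer accepts it.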

Hernando - MSFT
  • That explains it, thank you. However, I still have a problem changing the voice while using SSML. When I use the SpeakSsml method rather than StartSpeaking, the selected voice is disregarded. According to the documentation, there are four available voices (second link in the question description). – Ahmadreza Jan 22 '22 at 09:12
  • TextToSpeech.Voice should work. Could you deploy [SpeechRecognitionAndSynthesis](https://github.com/Microsoft/Windows-universal-samples/tree/main/Samples/SpeechRecognitionAndSynthesis) to your device and try again? I suppose the issue is caused by the SSML string not referencing the appropriate target namespaces. – Hernando - MSFT Jan 24 '22 at 09:18