I'm working on a research project in which we are building a new text-to-speech (TTS) engine that converts text to spoken audio. Since the engine already performs well, we want to make it usable by a large number of applications, so we would like it to show up as a TTS voice on Windows 10.

In Microsoft's developer documentation, all I found was information on how to use existing, already installed voices in my application. I didn't find any information on how to implement a voice so that it shows up as a Windows voice and can be used by any application using the Speech SDK or SAPI.

Which interface do I have to implement, or which API do I have to connect to, in order to get our new TTS engine to work with Windows Speech?

I have already searched the documentation of the Microsoft Speech SDK as well as developer pages such as https://learn.microsoft.com/en-us/dotnet/api/system.speech.synthesis.ttsengine

Xie Steven
MiH

1 Answer

You should look at the TTS Engine Vendor Porting Guide. You need to implement ISpTTSEngine, which does all the work, and ISpObjectWithToken, which manages registration and creation.
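To make the division of labour between the two interfaces concrete, here is a rough, portable sketch of the control flow only. It is not a COM implementation: every type below (`SketchTextFrag`, `SketchOutputSite`, `SketchVoiceToken`, `SketchFormat`, `SketchTtsEngine`) is a hypothetical stand-in invented for illustration, and the "synthesis" just emits silence. In a real engine the corresponding methods are COM methods declared in the SAPI headers, they return `HRESULT`, and their parameter lists are richer than what is shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the linked list of text fragments that the
// speech runtime hands to the engine.
struct SketchTextFrag {
    const SketchTextFrag* next;  // next fragment, or nullptr at the end
    std::wstring text;           // text to speak for this fragment
};

// Hypothetical stand-in for the output site the engine writes audio to.
// Here it simply collects 16-bit samples in memory.
struct SketchOutputSite {
    std::vector<std::int16_t> samples;
    void Write(const std::int16_t* data, std::size_t count) {
        samples.insert(samples.end(), data, data + count);
    }
};

// Hypothetical stand-in for the token that identifies the registered voice.
struct SketchVoiceToken {
    std::string voiceName;
};

// Hypothetical, simplified description of the audio format.
struct SketchFormat {
    int sampleRateHz;
    int bitsPerSample;
    int channels;
};

class SketchTtsEngine {
public:
    // Role played by ISpObjectWithToken: remember which voice we were
    // created for, so voice-specific data can be loaded.
    void SetObjectToken(const SketchVoiceToken* token) { token_ = token; }
    const SketchVoiceToken* GetObjectToken() const { return token_; }

    // Role played by ISpTTSEngine::GetOutputFormat: report the audio
    // format this engine produces (assumed here: 16 kHz, 16-bit, mono).
    SketchFormat GetOutputFormat() const { return {16000, 16, 1}; }

    // Role played by ISpTTSEngine::Speak: walk the fragment list,
    // synthesize each fragment, and write the audio to the output site.
    void Speak(const SketchTextFrag* frags, SketchOutputSite& site) const {
        for (const SketchTextFrag* f = frags; f != nullptr; f = f->next) {
            std::vector<std::int16_t> audio = Synthesize(f->text);
            if (!audio.empty()) {
                site.Write(audio.data(), audio.size());
            }
        }
    }

private:
    // Placeholder for the actual research engine: 10 ms of silence
    // per input character.
    std::vector<std::int16_t> Synthesize(const std::wstring& text) const {
        const std::size_t samplesPerChar = GetOutputFormat().sampleRateHz / 100;
        return std::vector<std::int16_t>(text.size() * samplesPerChar, 0);
    }

    const SketchVoiceToken* token_ = nullptr;
};
```

The point of the sketch is the shape: one pair of methods ties the engine to its registered voice, and the other pair reports the output format and turns text fragments into audio. The porting guide remains the authoritative source for the real signatures and the registration steps.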

Eric Brown
  • Why isn't this marked as the answer? Seems like the answer, and if the linked page is not omitting something, the ISpTTSEngine interface seems a lot simpler than I had expected. It only has two methods to implement. – Damn Vegetables Jun 22 '21 at 01:34
  • Yeah, implementing a TTS engine is straightforward. (Way, way, way easier than implementing an SR engine.) That being said, Microsoft isn’t making further investments into SAPI. – Eric Brown Jun 23 '21 at 02:44