Implementing a TTS service for Windows 10

Question

I'm working on a research project in which we are building a new text-to-speech (TTS) engine that converts text to spoken audio. Since the engine already performs well, we want to make it usable by a large number of applications, so we would like it to show up as a TTS voice on Windows 10.

In Microsoft's developer documentation, all I found was information on how to use existing, already installed voices in my application. I didn't find any information on how to implement a voice so that it shows up as a Windows voice and can be used by any application using the Speech SDK or SAPI.

Which interface do I have to implement, or which API do I have to connect to, in order to get our new TTS engine to work with Windows Speech?

I have already searched the documentation of the Microsoft Speech SDK as well as developer pages like https://learn.microsoft.com/en-us/dotnet/api/system.speech.synthesis.ttsengine

score 2 · Answer 1 · answered Apr 08 '19 at 18:37

You should look at the TTS Engine Vendor Porting Guide. You need to implement ISpTTSEngine, which does all the work, and ISpObjectWithToken, which manages registration and creation.
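To make the division of labour concrete, here is a rough, portable sketch of the shape such an engine takes. It is not real SAPI code: `TextFragment`, `OutputSite`, `WaveFormat` and `SketchTtsEngine` are hypothetical stand-ins I made up for illustration, and a real engine would be a COM object implementing `ISpTTSEngine` (`Speak`, `GetOutputFormat`) and `ISpObjectWithToken` (`SetObjectToken`, `GetObjectToken`) with the signatures from the SAPI headers. The synthesis step is a placeholder that emits silence.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins, NOT the real SAPI declarations.
struct TextFragment {                 // plays the role of SAPI's text fragment list
    std::wstring text;
    const TextFragment* next;
};

struct WaveFormat {                   // plays the role of WAVEFORMATEX
    std::uint32_t samplesPerSec;
    std::uint16_t bitsPerSample;
    std::uint16_t channels;
};

struct OutputSite {                   // plays the role of the engine site SAPI passes in
    std::vector<std::int16_t> samples;
    bool abortRequested = false;
    void Write(const std::int16_t* data, std::size_t count) {
        assert(data != nullptr || count == 0);
        samples.insert(samples.end(), data, data + count);
    }
};

class SketchTtsEngine {
public:
    // Role of ISpObjectWithToken: the host hands the engine the token
    // that describes the selected voice, and can ask for it back.
    void SetObjectToken(const std::wstring& tokenId) { tokenId_ = tokenId; }
    const std::wstring& GetObjectToken() const { return tokenId_; }

    // Role of ISpTTSEngine::GetOutputFormat: report the audio format
    // the engine produces (assumed here: 16 kHz, 16-bit, mono).
    WaveFormat GetOutputFormat() const { return {16000, 16, 1}; }

    // Role of ISpTTSEngine::Speak: walk the fragment list, synthesize
    // each fragment, and write the audio to the output site.
    bool Speak(const TextFragment* fragments, OutputSite& site) const {
        for (const TextFragment* f = fragments; f != nullptr; f = f->next) {
            if (site.abortRequested) return false;   // host asked us to stop
            std::vector<std::int16_t> audio = Synthesize(f->text);
            site.Write(audio.data(), audio.size());
        }
        return true;
    }

private:
    // Placeholder for the actual research engine:
    // 160 samples (10 ms at 16 kHz) of silence per character.
    std::vector<std::int16_t> Synthesize(const std::wstring& text) const {
        return std::vector<std::int16_t>(text.size() * 160, 0);
    }

    std::wstring tokenId_;
};
```

In a real engine, the body of `Synthesize` is where your own TTS code would go; the rest is mostly plumbing between SAPI and that code.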

Eric Brown

Why isn't this marked as the answer? Seems like the answer, and if the linked page is not omitting something, the ISpTTSEngine interface seems a lot simpler than I had expected. It only has two methods to implement. – Damn Vegetables Jun 22 '21 at 01:34
Yeah, implementing a TTS engine is straightforward. (Way, way, way easier than implementing an SR engine.) That being said, Microsoft isn’t making further investments into SAPI. – Eric Brown Jun 23 '21 at 02:44
