Is asynchronous text-to-speech voicing with SAPI 5.4 possible?

Question

I have a form and I want to allow the user to receive asynchronous (possibly overlapping) text-to-speech output based on the contents of a text box whenever a button is pressed. I'm trying to do this via SAPI 5.4 (Interop.SpeechLib.dll). I recognize that System.Speech or other more "modern" functionality would work much better, but this is my current constraint. Here is a simplified version of my function:

private void VoiceText(string myText)
{
    SpVoice voice = new SpVoice(); // Create new SpVoice instance
    voice.Volume = 100; // Set the volume level of the text-to-speech voice
    voice.Rate = -2; // Set the rate at which text is spoken by the text-to-speech engine
    voice.Speak(myText, SpeechVoiceSpeakFlags.SVSFlagsAsync); // Voice text (asynchronously?)
}

Using SVSFlagsAsync DOES allow subsequent code to execute; however, the actual voicing is always output sequentially (no overlapping, and there are brief pauses between voicing instances). I've tried calling this function as an async Task as well as in a separate thread, and the behavior remains the same. Is this simply a limitation of SpVoice?

So you're expecting to get multiple, overlapping instances of speech output? That's not really how SAPI works internally. — Eric Brown, Feb 19 '19 at 17:06
Good to know, thanks Eric. Could this be done using System.Speech.Synthesis? — Exergist, Feb 19 '19 at 22:25
Not really, as System.Speech.Synthesis is a nicer wrapper around the underlying SAPI engine. Can I inquire as to *why* you want overlapping synthesis? It seems to me that all you would get would be an unintelligible jumble. — Eric Brown, Feb 19 '19 at 23:27
Basically I want a very responsive "push to voice text to speech" button. As it stands there is a large delay between instances, and multiple button pushes exaggerate the problem. However, my code lies inside another (much) more complex app (VoiceAttack) that does provide a workaround. I have one more question related to SAPI that should be the last for my project, but I'll save it for a separate post. — Exergist, Feb 20 '19 at 03:22
Ah, so what you want is to cancel a running TTS stream. That's a completely different question, and you should probably edit your question to clarify this. — Eric Brown, Feb 21 '19 at 19:24

score 2 · Answer 1 · answered Feb 21 '19 at 19:28

You can cancel any currently running TTS request by using the SVSFPurgeBeforeSpeak flag on your Speak call, like this:

voice.Speak(text, SpeechVoiceSpeakFlags.SVSFlagsAsync | SpeechVoiceSpeakFlags.SVSFPurgeBeforeSpeak);
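As a minimal sketch of how this might fit into the question's code, the version below assumes that one SpVoice instance is kept as a field and reused for every button press, since the purge flag acts on requests pending on the voice it is called on. The class name TextVoicer is hypothetical and not part of SAPI.

```csharp
using SpeechLib; // COM interop assembly for SAPI 5.4 (Interop.SpeechLib.dll)

// Hypothetical helper class; the name is an assumption for illustration.
public sealed class TextVoicer
{
    // One voice shared by all calls, so a purge on it can reach
    // speech that an earlier call started.
    private readonly SpVoice voice = new SpVoice();

    public TextVoicer()
    {
        voice.Volume = 100; // same settings as in the question
        voice.Rate = -2;
    }

    public void VoiceText(string myText)
    {
        // Purge pending speak requests on this voice, then start the
        // new text without blocking the caller.
        voice.Speak(myText,
            SpeechVoiceSpeakFlags.SVSFlagsAsync |
            SpeechVoiceSpeakFlags.SVSFPurgeBeforeSpeak);
    }
}
```

A button handler would then call VoiceText on a single long-lived TextVoicer rather than constructing a new voice per click.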

answered Feb 21 '19 at 19:28

Eric Brown


I realized that I was creating a new SpVoice instance with each button press, which effectively isolated the voicing events. Now all button presses use the same instance of SpVoice; however, the purge only seems to work if less than a certain portion of the current word has been spoken. If I initiate a voicing of "alphabetically" and then quickly press the voicing button, the first voicing instance will be interrupted (so far so good). If I briefly pause and allow more of the first word instance to be spoken and THEN activate another voicing, the audio output becomes essentially synchronous. – Exergist Feb 23 '19 at 21:59
@Exergist Hm. You might need to set an event sink and make sure you can pump messages. It's been a while since I've dug heavily into SAPI internals. – Eric Brown Feb 25 '19 at 19:30
What do you mean by "event sink"? Yes, using SAPI here feels like shoving a big square through a small circle. – Exergist Feb 26 '19 at 00:40
I concluded that SAPI 5.4 is not well suited for what I'm trying to do. I will revise my code to leverage System.Speech and its `SpeakAsyncCancelAll()` and `SpeakAsync("my text")` methods. – Exergist Mar 14 '19 at 16:54
@Exergist You do know that System.Speech is a wrapper around SAPI, right? – Eric Brown Mar 14 '19 at 17:47
Indeed, which makes it all the more aggravating that I can't get it to work as intended without System.Speech. I posted a new question with my current method here: https://stackoverflow.com/questions/55168576/how-to-properly-dispose-of-speechsynthesizer-for-async-text-to-speech – Exergist Mar 14 '19 at 19:13
Basically consecutive button presses interrupt and restart the voicing. I can interrupt the spoken text at any time, whereas like I mentioned previously SAPI only allows me to interrupt if maybe <50% of the text has been spoken. If there's a way to get that working in SAPI I'm all ears. – Exergist Mar 14 '19 at 19:20
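
For reference, the System.Speech approach described in the comments above might look roughly like the sketch below. It is not the asker's actual code (that is in the linked question); the class name SynthVoicer is hypothetical, and the volume and rate values are simply carried over from the original SAPI snippet.

```csharp
using System;
using System.Speech.Synthesis; // requires a reference to System.Speech.dll

// Hypothetical wrapper class; the name is an assumption for illustration.
public sealed class SynthVoicer : IDisposable
{
    // One synthesizer reused for every button press.
    private readonly SpeechSynthesizer synth = new SpeechSynthesizer();

    public SynthVoicer()
    {
        synth.SetOutputToDefaultAudioDevice();
        synth.Volume = 100; // same settings as the SAPI version
        synth.Rate = -2;
    }

    public void VoiceText(string myText)
    {
        // Cancel anything speaking or queued, then start the new text
        // without blocking the caller.
        synth.SpeakAsyncCancelAll();
        synth.SpeakAsync(myText);
    }

    public void Dispose()
    {
        synth.SpeakAsyncCancelAll();
        synth.Dispose();
    }
}
```

With this shape, consecutive button presses interrupt and restart the voicing, which matches the behavior described in the last comment.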
