5

One can let the SpeechSynthesizer speak text in an asynchronous way, for example like this:

Private WithEvents _Synth As New SpeechSynthesizer

Private Sub TextBox1_KeyUp(sender As Object, e As KeyEventArgs) Handles TextBox1.KeyUp
    If e.KeyCode = Keys.Enter Then
        _Synth.SpeakAsync(New Prompt(Me.TextBox1.Text))
    End If
End Sub

The events that SpeechSynthesizer raises let us tell what the computer voice is currently speaking.

For example, you may visualize the speech output by selecting the characters like this:

Private Sub _Synth_SpeakProgress(sender As Object, e As SpeakProgressEventArgs) Handles _Synth.SpeakProgress

    Me.TextBox1.SelectionStart = e.CharacterPosition
    Me.TextBox1.SelectionLength = e.CharacterCount

End Sub

However, when SpeakAsync is called repeatedly (for example, when we tell the SpeechSynthesizer to speak the same text again while it is still speaking), the speech requests are queued, and the SpeechSynthesizer plays them one by one.

I haven't been able to find out which of the queued requests the synthesizer is currently speaking. The SpeakProgressEventArgs don't reveal this.

Using SAPI5, the events provided a StreamNumber:

Parameters
StreamNumber
    The stream number which generated the event. When a voice enqueues more than one stream by speaking asynchronously, the stream number is necessary to associate an event with the appropriate stream.

Using this StreamNumber, you could always tell which stream the synthesizer was currently speaking.

The System.Speech.Synthesis implementation is a modern version of the SAPI5 implementation.

However, I can't find a StreamNumber indicator or similar information.

System.Speech.Synthesis reports on nearly everything that happens during synthesis, so it seems highly unlikely that it doesn't also reveal which of the requests it is currently processing.

How could this be retrieved?

Justin Lessard
tmighty
  • Why can you not track this via the `Prompt` argument of `SpeakAsync`? The `SpeakProgressEventArgs` includes the `Prompt` as does the `SpeakCompletedEventArgs`. – TnTinMn Apr 11 '19 at 19:55
  • @TnTinMn I have tried that, but I haven't found any way to distinguish one prompt from another if the text of both is the same. So if you feed 5 "Hello!" prompts to the SpeechSynthesizer, all 5 look the same in the SpeechProgress event. What do you think about that? – tmighty Apr 12 '19 at 12:38
  • The [Prompt Class](https://learn.microsoft.com/en-us/dotnet/api/system.speech.synthesis.prompt?view=netframework-4.7.2) is not sealed, so you can make a derived class (`Inherits Prompt`) that has all the identifying properties you want and pass that to `SpeakAsync`. – TnTinMn Apr 12 '19 at 13:33

2 Answers

1

To clarify my comment about using the Prompt Class to hold any identifying state you need, consider the following where the Prompt holds a reference to the source TextBox.

Imports System.Speech.Synthesis
Public Class MyPrompt : Inherits Prompt
    Private tbRef As WeakReference(Of TextBox)

    Public Sub New(textBox As TextBox)
        MyBase.New(textBox.Text)
        ' only hold a weak reference to the TextBox
        ' to avoid any disposal issues
        tbRef = New WeakReference(Of TextBox)(textBox)
    End Sub

    Public ReadOnly Property SourceTextBox As TextBox
        Get
            Dim ret As TextBox = Nothing
            tbRef.TryGetTarget(ret)
            Return ret
        End Get
    End Property
End Class

Now your original code could be written as:

Imports System.Speech.Synthesis

Public Class Form1
    Private WithEvents _Synth As New SpeechSynthesizer

    Private Sub TextBox1_KeyUp(sender As Object, e As KeyEventArgs) Handles TextBox1.KeyUp
        If e.KeyCode = Keys.Enter Then
            ' use a custom prompt to store the TextBox
            _Synth.SpeakAsync(New MyPrompt(Me.TextBox1))
        End If
    End Sub

    Private Sub _Synth_SpeakProgress(sender As Object, e As SpeakProgressEventArgs) Handles _Synth.SpeakProgress
        Dim mp As MyPrompt = TryCast(e.Prompt, MyPrompt)
        If mp IsNot Nothing Then
            Dim tb As TextBox = mp.SourceTextBox
            If tb IsNot Nothing Then
                ' set the selection in the source TextBox
                tb.SelectionStart = e.CharacterPosition
                tb.SelectionLength = e.CharacterCount
            End If
        End If
    End Sub

End Class
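
As a side note on the comment about identical prompts: two Prompt objects with the same text are still two distinct instances, so they can be told apart by reference rather than by content. The following is only a sketch under that assumption, namely that e.Prompt is the same instance that was handed to SpeakAsync (this is what the derived-class approach above relies on as well). The module name QueueDemo and the array `queued` are made up for illustration; SpeakStarted and GetCurrentlySpokenPrompt are members of SpeechSynthesizer.

```vb
Imports System
Imports System.Speech.Synthesis

Module QueueDemo
    Sub Main()
        Using synth As New SpeechSynthesizer()
            synth.SetOutputToDefaultAudioDevice()

            ' Three prompts with identical text; each is a separate instance.
            Dim queued = {New Prompt("Hello!"), New Prompt("Hello!"), New Prompt("Hello!")}

            ' Assumption: e.Prompt is the very object that was queued,
            ' so a reference comparison (Is) finds its position.
            AddHandler synth.SpeakStarted,
                Sub(sender, e)
                    Dim position = Array.FindIndex(queued, Function(q) q Is e.Prompt)
                    Console.WriteLine($"Started request at queue position {position}")
                End Sub

            For Each p In queued
                synth.SpeakAsync(p)
            Next

            Console.WriteLine("Press Enter to query the prompt being spoken right now.")
            Console.ReadLine()

            ' Returns Nothing when the synthesizer is idle.
            Dim current = synth.GetCurrentlySpokenPrompt()
            If current IsNot Nothing Then
                Console.WriteLine($"Currently speaking queue position {Array.IndexOf(queued, current)}")
            Else
                Console.WriteLine("Nothing is being spoken at the moment.")
            End If
        End Using
    End Sub
End Module
```

A derived class is still the more convenient option once the prompt has to carry extra state, such as the source TextBox above.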

Edit:

The OP wants to use this with the SpeakSsmlAsync method. That in itself is not possible as that method creates a base Prompt using the Prompt(String, SynthesisTextFormat) Constructor and returns the created Prompt after calling SpeechSynthesizer.SpeakAsync(created_prompt).

Below is a new version of MyPrompt: a derived Prompt class that accepts either a string of SSML or a PromptBuilder instance, along with an integer identifier.

Imports System.Speech.Synthesis

Public Class MyPromptV2 : Inherits Prompt
    Public Sub New(ssml As String, identifier As Int32)
        MyBase.New(ssml, SynthesisTextFormat.Ssml)
        Me.Identifier = identifier
    End Sub

    Public Sub New(builder As PromptBuilder, identifier As Int32)
        MyBase.New(builder)
        Me.Identifier = identifier
    End Sub

    Public ReadOnly Property Identifier As Int32
End Class

...

Imports System.Speech.Synthesis

Public Class Form1
    Private WithEvents _Synth As New SpeechSynthesizer

    Private Sub TextBox1_KeyUp(sender As Object, e As KeyEventArgs) Handles TextBox1.KeyUp
        If e.KeyCode = Keys.Enter Then
            ' build some ssml from the text
            Dim pb As New PromptBuilder
            pb.AppendText(TextBox1.Text)
            ' use ssml and an integer
            _Synth.SpeakAsync(New MyPromptV2(pb.ToXml, 10))
            ' or 
            '_Synth.SpeakAsync(New MyPromptV2(pb, 10))
        End If
    End Sub

    Private Sub _Synth_SpeakProgress(sender As Object, e As SpeakProgressEventArgs) Handles _Synth.SpeakProgress
        Dim mp As MyPromptV2 = TryCast(e.Prompt, MyPromptV2)
        If mp IsNot Nothing Then
            Select Case mp.Identifier
                Case 10
                    TextBox1.SelectionStart = e.CharacterPosition
                    TextBox1.SelectionLength = e.CharacterCount
            End Select
        End If
    End Sub
End Class
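
The identifier 10 above is hard-coded, so repeated requests would all carry the same number. If every queued request should get its own number, the prompt can assign one itself. This is a minimal sketch, not part of the framework: NumberedPrompt, its Id property and the Report helper are hypothetical names, and the console module stands in for the form so that the example is self-contained.

```vb
Imports System
Imports System.Speech.Synthesis
Imports System.Threading

' Hypothetical helper: a prompt that numbers itself when it is created.
Public Class NumberedPrompt : Inherits Prompt
    Private Shared _lastId As Integer = 0

    Public Sub New(builder As PromptBuilder)
        MyBase.New(builder)
        ' Interlocked keeps the counter consistent across threads.
        Id = Interlocked.Increment(_lastId)
    End Sub

    Public ReadOnly Property Id As Integer
End Class

Module NumberedDemo
    Sub Main()
        Using synth As New SpeechSynthesizer()
            synth.SetOutputToDefaultAudioDevice()

            ' Report which request starts and which one finishes.
            AddHandler synth.SpeakStarted, Sub(sender, e) Report("started", e.Prompt)
            AddHandler synth.SpeakCompleted, Sub(sender, e) Report("completed", e.Prompt)

            ' Queue the same text three times; each request gets its own Id.
            For i = 1 To 3
                Dim pb As New PromptBuilder()
                pb.AppendText("Hello!")
                synth.SpeakAsync(New NumberedPrompt(pb))
            Next

            Console.WriteLine("Press Enter to quit.")
            Console.ReadLine()
        End Using
    End Sub

    Private Sub Report(stage As String, p As Prompt)
        Dim np = TryCast(p, NumberedPrompt)
        If np IsNot Nothing Then
            Console.WriteLine($"Request {np.Id} {stage}")
        End If
    End Sub
End Module
```

In the form-based code, the same Id could be read in the SpeakProgress handler in place of the fixed `Case 10`.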
TnTinMn
  • Your answer is really elegant. Could you please show how one could introduce an integer (for "ID") in the MyPrompt class and NOT use a textbox? I have tried to do that, but it wouldn't compile as _Synth.SpeakSsmlAsync wouldn't accept "MyPrompt". My line was _Synth.SpeakSsmlAsync(New MyPrompt(iMyNewID, MyStringBuilderToCompileSsml.ToString)). – tmighty Apr 12 '19 at 16:43
  • @tmighty, `SpeakSsmlAsync` is just a helper method that creates a `Prompt` using the `New Prompt(textToSpeak, SynthesisTextFormat.Ssml)` constructor and then it calls `SpeakAsync(prompt)` . There is no way to modify the `Prompt` that that method creates. A constructor overload can be added to `MyPrompt` to accept ssml though as well as your integer value. I will edit the post shortly. – TnTinMn Apr 12 '19 at 17:01
  • OMG, thank you so much for your profound answer. Did you program that part of the framework, or are you just a monster? :-) – tmighty Apr 12 '19 at 18:08
  • 1
    @tmighty, no I'm not to blame for the framework; I guess that leaves the second option you provided. :) However, I prefer the term [ogre](https://en.wikipedia.org/wiki/Ogre). – TnTinMn Apr 12 '19 at 19:14
0

There is another way to find out which sentence is currently being processed. You could assign choice numbers to your sentences and then recognize the speech by getting the index of that sentence, which you can then use in further conditions. Use the SpeechRecognizedEventArgs argument of the SpeechRecognized handler to get the sentence index.

void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
  string txt = e.Result.Text;
  int sentenceIndex = txt.IndexOf("My Sentence");

  if (sentenceIndex >= 0)
  {
    Console.WriteLine("Currently Speaking Sentence: My Sentence, with index number: " 
                 + sentenceIndex);
  }

  //.... some code here
}

Follow full example here.


Edit 1:

The class-scope SpeechSynthesizer object gives the application the ability to speak. The SpeechRecognitionEngine object allows the application to listen for and recognize spoken words or phrases.

  • Thank you, but my question is about SpeechSynthesis, not SpeechRecognition. – tmighty Apr 11 '19 at 14:52
  • Did you follow the full example? See Edit. –  Apr 11 '19 at 15:42
  • Yes, I did, there was nothing in that "full example" article that would contribute to my question. Also, I think, you're confusing what "int sentenceIndex" does in the example that you've stated. But thank you anyways. – tmighty Apr 11 '19 at 20:20