I'm trying to change the pitch of spoken text via SSML and the .NET SpeechSynthesizer (System.Speech.Synthesis):

SpeechSynthesizer synthesizer = new SpeechSynthesizer();
PromptBuilder builder = new PromptBuilder();
builder.AppendSsml(@"C:\Users\me\Documents\ssml1.xml");
synthesizer.Speak(builder);

The content of the ssml1.xml file is:

<?xml version="1.0" encoding="ISO-8859-1"?>
<ssml:speak version="1.0"
xmlns:ssml="http://www.w3.org/2001/10/synthesis"
xml:lang="en-US">
<ssml:sentence>
Your order for <ssml:prosody pitch="+30%" rate="-90%" >8 books</ssml:prosody>
will be shipped tomorrow.
</ssml:sentence>
</ssml:speak>

The rate is recognized: "8 books" is spoken much more slowly than the rest. But no matter what value is set for "pitch", it makes no difference! The allowed values can be found here:

http://www.w3.org/TR/speech-synthesis/#S3.2.4

Am I missing something, or is changing the pitch just not supported by the Microsoft Speech engine?

fritz


1 Answer

The SsmlParser engine used by System.Speech accepts a pitch attribute in its ProcessProsody method, but it does not process it.

It only processes the range, rate, volume and duration attributes. It also parses contour, but treats it as range (not sure why)...

Edit: if you don't really need to read the text from an SSML XML file, you can create the text programmatically.
Instead of

builder.AppendSsml(@"C:\Users\me\Documents\ssml1.xml");

use

builder.Culture = CultureInfo.CreateSpecificCulture("en-US");
builder.StartVoice(builder.Culture);
builder.StartSentence();

builder.AppendText("Your order for ");

builder.StartStyle(new PromptStyle() { Emphasis = PromptEmphasis.Strong, Rate = PromptRate.ExtraSlow });
builder.AppendText("8 books");
builder.EndStyle();

builder.AppendText(" will be shipped tomorrow.");

builder.EndSentence();
builder.EndVoice();
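
To check what the builder actually produces, the snippet above can be wrapped in a small console program that prints the generated SSML before speaking it. This is only a minimal sketch: the class name PromptStyleDemo and the choice of PromptVolume.Loud are illustrative assumptions, not part of the original code. Note that PromptStyle only exposes Emphasis, Rate and Volume, so there is no pitch to set on this route either.

```csharp
using System;
using System.Globalization;
using System.Speech.Synthesis;

// Hypothetical demo class; requires a reference to System.Speech (Windows only).
class PromptStyleDemo
{
    static void Main()
    {
        // Build the prompt in code instead of loading an SSML file.
        var builder = new PromptBuilder(CultureInfo.GetCultureInfo("en-US"));
        builder.StartSentence();
        builder.AppendText("Your order for ");

        // PromptStyle has Emphasis, Rate and Volume; it has no pitch property.
        builder.StartStyle(new PromptStyle
        {
            Rate = PromptRate.ExtraSlow,
            Volume = PromptVolume.Loud // illustrative choice, not in the original
        });
        builder.AppendText("8 books");
        builder.EndStyle();

        builder.AppendText(" will be shipped tomorrow.");
        builder.EndSentence();

        // Print the SSML that the builder generated, to inspect the prosody attributes.
        Console.WriteLine(builder.ToXml());

        using (var synthesizer = new SpeechSynthesizer())
        {
            synthesizer.SetOutputToDefaultAudioDevice();
            synthesizer.Speak(builder);
        }
    }
}
```

Comparing the printed markup with the hand-written ssml1.xml should make it easier to see which prosody attributes the builder emits at all.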
Jaroslav Jandek
  • I wonder if there is any other speech engine API that can be used with .NET and that processes pitch commands? – fritz Feb 13 '11 at 10:35
  • @fritz: there are not many .NET APIs. There are many native APIs, not many of them are "free", though. I have used **eSpeak** (not .NET) with success - better output than from `System.Speech` but it did not read SSML well. – Jaroslav Jandek Feb 13 '11 at 11:21
  • Is there a way to sing with .NET speech or any alternative? I'm looking for an API that supports the three features of control: 1) Speech 2) Accurate stable pitch 3) Duration control. Is there such a thing? I obviously prefer a musically-driven API. – Shimmy Weitzhandler Oct 05 '14 at 02:36