8

I am writing a simple spelling test app using the HTML5 SpeechSynthesis API. The text I would like my app to say is something like the following: "The spelling word is Cat. The cat chased the dog.".

The API tends to race from the first sentence to the second without much of a pause. I wonder if there is a way to insert a short pause between the two sentences. I realize I could create two separate utterances and use the pause() call. However, the code would be simpler and less brittle if I could simply insert grammatical hints.

Normally in spoken English, one tends to pause a little longer between paragraphs. So I inserted a newline character in my text, but there was no noticeable impact.

I also tried using an ellipsis.

Is there any way to do this or am I stuck breaking everything into separate utterances?

Bob Woodley

4 Answers

6

Split your text on commas (or a custom delimiter) and add your own pause between the parts using a timeout.

Here is a simple example as a proof-of-concept. Extending it, you can customize your text to include hints as to how long to pause.

function speakMessage(message, PAUSE_MS = 500) {
  try {
    const messageParts = message.split(',')

    let currentIndex = 0
    const speak = (textToSpeak) => {
      const msg = new SpeechSynthesisUtterance();
      const voices = window.speechSynthesis.getVoices();
      msg.voice = voices[0];
      msg.volume = 1; // 0 to 1
      msg.rate = 1; // 0.1 to 10
      msg.pitch = .1; // 0 to 2
      msg.text = textToSpeak;
      msg.lang = 'en-US';

      msg.onend = function() {
        currentIndex++;
        if (currentIndex < messageParts.length) {
          setTimeout(() => {
            speak(messageParts[currentIndex])
          }, PAUSE_MS)
        }
      };
      speechSynthesis.speak(msg);
    }
    speak(messageParts[0])
  } catch (e) {
    console.error(e)
  }
}


function run(pause) {
  speakMessage('Testing 1,2,3', pause)
}
<button onclick='run(0)'>Speak No Pause</button>
<button onclick='run(500)'>Speak Pause</button>
<button onclick='run(1000)'>Speak Pause Longer</button>
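
One way to extend this with per-pause hints is sketched below. The `[500]` marker syntax and the `parsePauseHints` name are made up for illustration; they are not part of the SpeechSynthesis API. The function only splits the text into parts with a pause length attached, so the `onend` handler above could read `pauseAfterMs` from the current part instead of using a fixed `PAUSE_MS`.

```javascript
// Hypothetical hint syntax (an assumption, not a standard): "[500]" in the
// text means "pause 500 ms here". Returns the parts to speak, in order.
function parsePauseHints(message, defaultPauseMs = 0) {
  const segments = [];
  const hintPattern = /\[(\d+)\]/g;
  let lastIndex = 0;
  let match;

  while ((match = hintPattern.exec(message)) !== null) {
    const text = message.slice(lastIndex, match.index).trim();
    const pauseMs = Number(match[1]);
    if (text) {
      segments.push({ text, pauseAfterMs: pauseMs });
    } else if (segments.length > 0) {
      // Back-to-back hints extend the pause after the previous part.
      segments[segments.length - 1].pauseAfterMs += pauseMs;
    }
    // A hint before any text is ignored: there is nothing to pause after.
    lastIndex = hintPattern.lastIndex;
  }

  // Whatever follows the last hint gets the default pause.
  const tail = message.slice(lastIndex).trim();
  if (tail) {
    segments.push({ text: tail, pauseAfterMs: defaultPauseMs });
  }
  return segments;
}
```

For the text in the question, `parsePauseHints('The spelling word is Cat. [800] The cat chased the dog.')` gives two parts, with an 800 ms pause after the first.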
Steven Spungin
  • Solved exactly my problem. Thumbs up! – jerik Jan 30 '21 at 15:12
  • Thanks a lot. This solved it in my case. The accepted answer didn't work for me with Firefox and Mac in 2021. But this one works like a charm. – Jorge Feb 23 '21 at 20:17
5

Using an exclamation point "!" adds a nice delay for some reason.

You can chain them together with periods to extend the pause.

"Example text! . ! . ! . !"
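
If you want to generate that padding instead of typing it, a small helper like the following might do. The `padWithExclamations` name is made up, and how long each "! ." actually pauses depends on the voice and engine, so treat `repeats` as something to tune by ear.

```javascript
// Hypothetical helper: builds the "text! . ! . !" pattern shown above.
// Any sentence-ending punctuation already on the text is replaced by "!".
function padWithExclamations(text, repeats = 3) {
  const base = text.trim().replace(/[.!?]+$/, '');
  return base + '!' + ' . !'.repeat(repeats);
}

// The padded first sentence is followed by the second one in a single utterance.
const padded = padWithExclamations('The spelling word is Cat.', 2) +
  ' The cat chased the dog.';
```

Note that this changes the final period to an exclamation point, which some voices may read with a different intonation.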
Myka
3

Just insert

<silence msec="5000" />

in the text for a 5-second pause (Source).

Disclaimer: This code works only in an appropriate user agent.

// code taken from https://richjenks.com/dev/speechsynthesis/
var utterance  = new SpeechSynthesisUtterance(),
    speak      = document.getElementById("speak"),
    text       = document.getElementById("text");

// Delay links and events because speechSynthesis is funny
speechSynthesis.getVoices();
setTimeout(function () {
    // Add event listeners
    var voiceLinks = document.querySelectorAll(".voice");
    for (var i = 0; i < voiceLinks.length; i++) {
        voiceLinks[i].addEventListener("click", function (event) {
            utterance.voice = speechSynthesis.getVoices()[this.dataset.voice];
        });
    }
}, 100);

// Say text when button is clicked
speak.addEventListener("click", function (event) {
    utterance.text = text.value;
    speechSynthesis.speak(utterance);
});
<textarea id="text" rows="5" cols="50">Hi <silence msec="2000" /> Flash!</textarea>
<br>
<button id="speak">Speak</button>
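
Since the comments on this answer report that some browser versions read the tag aloud, a possible fallback is to strip the tag before speaking. This is only a sketch: `replaceSilenceTags` is a made-up name, and replacing the tag with a comma is an assumption that gives a short, engine-dependent pause rather than the requested number of milliseconds.

```javascript
// Hypothetical fallback: swap each <silence msec="N" /> tag for plain
// punctuation so a browser that does not understand it never speaks it.
function replaceSilenceTags(text, replacement = ', ') {
  const silenceTag = /\s*<silence\s+msec="\d+"\s*\/>\s*/gi;
  return text.replace(silenceTag, replacement).trim();
}

// 'Hi <silence msec="2000" /> Flash!' becomes 'Hi, Flash!'
const fallbackText = replaceSilenceTags('Hi <silence msec="2000" /> Flash!');
```

You would still need your own check for whether the current browser honors the tag; this helper does not detect that.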
Nina Scholz
  • 3
    Does this work for you? For me, it speaks the tag as well as the text. Does it work on the https://richjenks.com/dev/speechsynthesis/ site for you? Also, I can't find any documentation on the tag. Did you find some? – Bob Woodley Jan 31 '16 at 18:57
  • I just upgraded to version 48 and it worked. Version 47 does not. Which means I'll have to test for version number. :( Anyhow, thanks for the answer. Wish I could find some docs on it. – Bob Woodley Jan 31 '16 at 19:03
  • please see source above. – Nina Scholz Jan 31 '16 at 19:04
  • How did you even find out about this tag? It is not in the SSML spec: https://www.w3.org/TR/speech-synthesis/ – Bob Woodley Jan 31 '16 at 19:56
  • A friend of mine was experimenting with Balabolka TTS yesterday, and the help system had a hint about the tag for silence. – Nina Scholz Jan 31 '16 at 20:50
  • 4
    Not working on Chrome 55.0 - it reads "Hi less than silence" etc. Does anyone have a solution for this yet? I've tried all combinations of spaces and dots... – Sanjay Manohar Jan 09 '17 at 00:20
  • @SanjayManohar, right, the current browser was not available at the time. The given solution works with the older browser version and was good enough to get an accept mark. What should I actually do? Write API code for the future? – Nina Scholz Jun 02 '17 at 16:30
1

I’ve found inserting synthetic pauses using commas to be quite useful (as well as making other manipulations). Here’s a little excerpt:

var speech = new SpeechSynthesisUtterance(),
    $content = document.querySelector('main').cloneNode(true),
    $space = $content.querySelectorAll('pre'),
    $pause_before = $content.querySelectorAll('h2, h3, h4, h5, h6, p, li, dt, blockquote, pre, figure, footer'),
    $skip = $content.querySelectorAll('aside, .dont_read');

// Don’t read
$skip.forEach(function( $el ){
    $el.innerHTML = '';
});

// spacing out content
$space.forEach(function($el){
    $el.innerHTML = ' ' + $el.innerHTML.replace(/[\r\n\t]/g, ' ') + ' ';
});

// Synthetic Pauses
$pause_before.forEach(function( $el ){
    $el.innerHTML = ' , ' + $el.innerHTML;
});

speech.text = $content.textContent;

The key is to clone the content node first so you can work with it in memory rather than manipulating the actual content. It seems to work pretty well for me and I can control it in the JavaScript code rather than having to modify the page source.
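
If your text is a plain string rather than a DOM node, as in the question, the same comma trick can be sketched without the DOM. The `joinWithCommaPauses` name is made up, and, as above, the length of the pause a comma produces is up to the speech engine.

```javascript
// Hypothetical plain-text variant of the excerpt above: flatten whitespace
// inside each block and put a " , " pause marker in front of it.
function joinWithCommaPauses(blocks) {
  return blocks
    .map(function (block) {
      return ' , ' + block.replace(/[\r\n\t]/g, ' ').trim();
    })
    .join('');
}

const spoken = joinWithCommaPauses([
  'The spelling word is Cat.',
  'The cat chased the dog.'
]);
```

The result can then be assigned to `speech.text` in the same way as `$content.textContent`.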

Aaron Gustafson