Questions tagged [microsoft-speech-api]

The Microsoft Speech API (SAPI) provides a high-level interface between an application and speech engines. SAPI implements all the low-level details needed to control and manage the real-time operations of various speech engines.

The two basic types of SAPI engines are text-to-speech (TTS) systems and speech recognizers. TTS systems synthesize text strings and files into spoken audio using synthetic voices. Speech recognizers convert human spoken audio into readable text strings and files.

API for Text-to-Speech

Applications can control text-to-speech (TTS) using the ISpVoice Component Object Model (COM) interface. Once an application has created an ISpVoice object (see Text-to-Speech Tutorial), the application only needs to call ISpVoice::Speak to generate speech output from some text data.

The ISpVoice interface also provides methods for changing voice and synthesis properties, such as the speaking rate (ISpVoice::SetRate), the output volume (ISpVoice::SetVolume), and the current speaking voice (ISpVoice::SetVoice).
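
As a rough illustration (a minimal sketch in the spirit of the Text-to-Speech Tutorial, not the tutorial's exact code), a C++ console program using these calls might look like the following; it assumes the Windows SDK SAPI header (sapi.h) is available and that speaking to the default audio output is acceptable:

#include <sapi.h>

int main()
{
    // COM must be initialized on the calling thread before SAPI is used.
    if (FAILED(::CoInitialize(NULL)))
        return 1;

    ISpVoice* pVoice = NULL;
    // Create the TTS voice object.
    HRESULT hr = ::CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL,
                                    IID_ISpVoice, (void**)&pVoice);
    if (SUCCEEDED(hr))
    {
        pVoice->SetRate(0);      // speaking rate: -10 (slowest) to 10 (fastest)
        pVoice->SetVolume(100);  // output volume: 0 to 100
        // Synthesize the string synchronously to the default audio output.
        pVoice->Speak(L"Hello from the Microsoft Speech API.", SPF_DEFAULT, NULL);
        pVoice->Release();
    }

    ::CoUninitialize();
    return 0;
}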

API for Speech Recognition

Just as ISpVoice is the main interface for speech synthesis, ISpRecoContext is the main interface for speech recognition. Like ISpVoice, it is an ISpEventSource, which means that it is the speech application's vehicle for receiving notifications of the requested speech recognition events.
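
Along the same lines, here is a hedged C++ sketch of dictation with the shared recognizer and ISpRecoContext. The Win32-event notification style, the static dictation grammar, and handling only a single recognition event are illustrative assumptions, not the only way to use the interface; most error handling is reduced to HRESULT chaining for brevity:

#include <sapi.h>
#include <stdio.h>

int main()
{
    if (FAILED(::CoInitialize(NULL)))
        return 1;

    ISpRecognizer*  pRecognizer = NULL;
    ISpRecoContext* pContext = NULL;
    ISpRecoGrammar* pGrammar = NULL;

    // Create the shared recognizer and a recognition context from it.
    HRESULT hr = ::CoCreateInstance(CLSID_SpSharedRecognizer, NULL, CLSCTX_ALL,
                                    IID_ISpRecognizer, (void**)&pRecognizer);
    if (SUCCEEDED(hr)) hr = pRecognizer->CreateRecoContext(&pContext);

    // Receive notifications through a Win32 event, and only for recognitions.
    if (SUCCEEDED(hr)) hr = pContext->SetNotifyWin32Event();
    if (SUCCEEDED(hr)) hr = pContext->SetInterest(SPFEI(SPEI_RECOGNITION),
                                                  SPFEI(SPEI_RECOGNITION));

    // Load and activate a dictation grammar.
    if (SUCCEEDED(hr)) hr = pContext->CreateGrammar(0, &pGrammar);
    if (SUCCEEDED(hr)) hr = pGrammar->LoadDictation(NULL, SPLO_STATIC);
    if (SUCCEEDED(hr)) hr = pGrammar->SetDictationState(SPRS_ACTIVE);

    // Block until the engine reports a recognition, then read its text.
    if (SUCCEEDED(hr) && pContext->WaitForNotifyEvent(INFINITE) == S_OK)
    {
        SPEVENT evt = {0};
        ULONG fetched = 0;
        if (pContext->GetEvents(1, &evt, &fetched) == S_OK &&
            evt.eEventId == SPEI_RECOGNITION)
        {
            ISpRecoResult* pResult = (ISpRecoResult*)evt.lParam;
            LPWSTR pszText = NULL;
            if (SUCCEEDED(pResult->GetText(SP_GETWHOLEPHRASE, SP_GETWHOLEPHRASE,
                                           TRUE, &pszText, NULL)))
            {
                wprintf(L"Recognized: %s\n", pszText);
                ::CoTaskMemFree(pszText);
            }
            pResult->Release(); // the event holds a reference to the result
        }
    }

    if (pGrammar)    pGrammar->Release();
    if (pContext)    pContext->Release();
    if (pRecognizer) pRecognizer->Release();
    ::CoUninitialize();
    return 0;
}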

Source: http://msdn.microsoft.com/en-us/library/ee125077(v=vs.85).aspx

82 questions
2
votes
0 answers

SetInputToDefaultAudioDevice throws 'System.InvalidOperationException' occurred in Microsoft.Speech.dll

Executing this code on Windows 10 Pro with the microphone enabled throws an exception. Any idea why? using Microsoft.Speech.Recognition; ... static void Main(string[] args) { // Create a SpeechRecognitionEngine object for the default…
Artur Kedzior
2
votes
0 answers

System.Speech.Recognition accuracy with DictationGrammar

Hello, I'm trying to find a free and useful speech recognition option for a C# Windows application. I've tried System.Speech.Recognition, but if a phrase or word is not pre-recorded and I want to use DictationGrammar, sometimes I have to say the same phrase 20 times or…
2
votes
2 answers

Does Microsoft SAPI support speech recognition in offline mode just like the System.Speech API?

I have read the official documentation of Microsoft SAPI, but I couldn't find whether the API can be used in offline mode or not. There, they said that Microsoft SAPI is a server-based speech recognition API, so it seems like it doesn't support…
2
votes
0 answers

Detect word by prefix only in SRGS grammar

I'm using an SRGS grammar to refine the accuracy of a Microsoft Speech STT service. I have specific needs for this grammar because I would like it to match some words by prefix only, but get the whole word as a result of detection. This is the kind of…
Cécile Fecherolle
2
votes
0 answers

How to create a language for the Microsoft speech recognizer

I am developing a speech recognition application in C# using the Microsoft Speech API. I need a speech recognizer for the following language: uz-Latn-UZ. The speech recognition language does not exist for this language, so I want to create my own…
2
votes
1 answer

Use Hindi or Kannada language with the Microsoft Speech SDK

Are there any language packs available in Hindi or Kannada for the Microsoft Speech SDK? Hindi and Kannada are languages spoken in India.
Satyajit
1
vote
0 answers

How can I use the output of recognized speech in PyWebIO (web app framework), using put(text)?

I am developing a web app using PyWebIO which is basically a speech translation/transcription tool. I created the main layout, which basically looks like this: Main Page Design. Here, I used the put_scrollable function to create a text…
1
vote
1 answer

Microsoft Speech to Text (SPXERR_AUDIO_SYS_LIBRARY_NOT_FOUND)

I have used speech to text in my project, without Docker, and my project runs beautifully, but when I add a Dockerfile and compose file to my project, speech to text stops working, giving me this error: Traceback (most recent…
1
vote
2 answers

Does Azure's Speech to Text service accept Webm audio and does it offer an output with timestamps?

I'm trying to decide whether Azure is the best platform for my transcription needs. I have two questions about Azure's Speech to Text service: does it accept WebM audio as input, and does it offer an output with timestamps?
1
vote
1 answer

Microsoft Speech Synthesizer Lexicon not working

I have followed the example here for adding a custom lexicon to my speech SSML. However, it is being ignored. I tried it with my own lexicon and also with the sample. At first the sample seemed to work, but when I removed the lexicon it still…
1
vote
1 answer

Passing a loaded variable as an argument instead of a file path in Python

I'm not too familiar with Python; apologies if this is too trivial a question. I have a script that gets an audio file from a URL. I need to convert the file from .ogg to .wav. Then I want to pass the converted and loaded file to a function that…
1
vote
1 answer

On deploy, System.Speech.dll object not getting set to an instance of an object

It is working on my local system but not on the live server. I am getting this error: NullReferenceException: Object reference not set to an instance of an object.] System.Speech.Internal.ObjectTokens.SAPICategories.DefaultDeviceOut() +79 …
1
vote
2 answers

Microsoft.CognitiveServices.Speech.Core.dll not found exception

I'm developing a hybrid Xamarin.Forms application with Cognitive Services from Microsoft Azure on .NET Standard 2.0. I got the NuGet packages CognitiveServices 0.1.0 and Microsoft.CognitiveServices.Speech 1.3.1 along with Xamarin.Forms 3.5.0. NuGet packages…
bala
1
vote
0 answers

SAPI Symbol Usage for Speech Dictionary Input

I've been doing some work to add words and pronunciations to the Windows speech dictionary via the SpLexicon interface of SAPI 5.4 (which I think is the only way to do it), using the AddPronunciation function, or in my case: // Initialize SpLexicon…
1
vote
0 answers

Bing Speech to Text end-of-recognition timeout

I am currently working with the Microsoft Bing Speech to Text API. I would like to stop the audio listening after a silence of n seconds. Is this possible?