Questions tagged [microsoft-speech-api]

The Microsoft Speech API (SAPI) provides a high-level interface between an application and speech engines. SAPI implements all the low-level details needed to control and manage the real-time operations of various speech engines.

The two basic types of SAPI engines are text-to-speech (TTS) systems and speech recognizers. TTS systems synthesize text strings and files into spoken audio using synthetic voices. Speech recognizers convert human spoken audio into readable text strings and files.

API for Text-to-Speech

Applications can control text-to-speech (TTS) using the ISpVoice Component Object Model (COM) interface. Once an application has created an ISpVoice object (see Text-to-Speech Tutorial), the application only needs to call ISpVoice::Speak to generate speech output from some text data.

In addition, the ISpVoice interface provides several methods for changing voice and synthesis properties, such as the speaking rate (ISpVoice::SetRate), the output volume (ISpVoice::SetVolume), and the current speaking voice (ISpVoice::SetVoice).
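
As a rough illustration of the calls described above, here is a minimal Python sketch. It does not use the COM ISpVoice interface directly; it assumes the pywin32 package is installed and goes through the SAPI.SpVoice automation object, whose Rate and Volume properties and Speak method correspond to ISpVoice::SetRate, ISpVoice::SetVolume and ISpVoice::Speak. The helper names voice_settings and speak are hypothetical and not part of SAPI.

```python
import sys

# Ranges accepted by ISpVoice::SetRate (-10..10) and ISpVoice::SetVolume (0..100).
RATE_MIN, RATE_MAX = -10, 10
VOLUME_MIN, VOLUME_MAX = 0, 100


def voice_settings(rate=0, volume=100):
    """Clamp rate and volume into the ranges SAPI accepts (hypothetical helper)."""
    return {
        "Rate": max(RATE_MIN, min(RATE_MAX, int(rate))),
        "Volume": max(VOLUME_MIN, min(VOLUME_MAX, int(volume))),
    }


def speak(text, rate=0, volume=100):
    """Speak text through SAPI; return True only if speech was attempted."""
    settings = voice_settings(rate, volume)
    if sys.platform != "win32":
        return False  # SAPI exists only on Windows
    try:
        import win32com.client  # assumption: pywin32 is installed
    except ImportError:
        return False
    # SAPI.SpVoice is the automation object layered over ISpVoice.
    voice = win32com.client.Dispatch("SAPI.SpVoice")
    voice.Rate = settings["Rate"]      # counterpart of ISpVoice::SetRate
    voice.Volume = settings["Volume"]  # counterpart of ISpVoice::SetVolume
    voice.Speak(text)                  # counterpart of ISpVoice::Speak
    return True
```

On a Windows machine with pywin32, calling speak("Hello from SAPI", rate=2, volume=80) should read the text aloud slightly faster than the default; on other platforms the function returns False without touching COM.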

API for Speech Recognition

Just as ISpVoice is the main interface for speech synthesis, ISpRecoContext is the main interface for speech recognition. Like ISpVoice, it is an ISpEventSource, which means that it is the speech application's vehicle for receiving notifications of the requested speech recognition events.

Source: http://msdn.microsoft.com/en-us/library/ee125077(v=vs.85).aspx

82 questions
0
votes
1 answer

Does the Cognitive Services Speech SDK work on non-Ubuntu Linux? If so, what are the required dependencies?

As the subject says, I tried to follow the quick start guide to run the Speech API on non-Ubuntu Linux (see below), but I wonder if anyone has gotten it to work or if it is just not supported. cat /proc/version Linux version 4.14.77-70.82.amzn1.x86_64…
0
votes
2 answers

Microsoft Translator Speech missing punctuation

I am using MS Translator Speech WebSocket API for real-time speech recognition and translation. The problem is that sometimes the recognised text does not have punctuation (commas, full stops, etc.). The transcribed text looks good otherwise. I also…
0
votes
0 answers

Microsoft Bing Speech API using Python: No JSON Object

I'm trying to implement the Microsoft Bing Speech REST API with Python and I've found some code online. https://www.taygan.co/blog/2018/02/09/getting-started-with-speech-to-text import json import requests YOUR_API_KEY = 'ENTER_YOUR_KEY_HERE' …
anna
0
votes
1 answer

Adding Speech to Text module to C# bot

I need to add Speech to Text capability to an MS bot written in C#. I'm new to C# (although I do know C++) and was wondering if I can use JS for the same. I'm quite familiar with JavaScript and have written a Speech to Text module using…
0
votes
2 answers

Microsoft Custom Speech Service issue when using web socket url

So recently, for a work project, I've been playing around with speech to text models and in particular custom speech to text models. With a bit of mixing and matching examples I've managed to get a test application to talk to the normal Bing speech to…
0
votes
1 answer

Inputting Audio Stream to FFMPEG

I’m building a real-time chat application with C# and ffmpeg.exe. My requirement is to get a memory stream from the Microsoft Speech API and feed it to the ffmpeg process in real time. I can take a memory stream from the Microsoft Speech API. I’m using…
Wijaya
0
votes
1 answer

Issue in setting up Microsoft Bing speech recognition

I am trying to use Microsoft's Bing Speech Recognition service library. The following command has to be given in the cmd with arguments. But I have no idea in which format I should enter this command. I could not find it anywhere. Can someone help…
Kabilesh
0
votes
1 answer

NodeJs websocket client for Custom Speech Service

I would like to create a websocket client for the Custom Speech service using a programming language such as Java, NodeJs, Go. Where can I find some technical information on how to consume that websocket from scratch (the expected message, fields,…
0
votes
1 answer

Microsoft Speech API with Python requests?

I'm trying to use the requests package in Python to make a call to the Microsoft Bing Speech Transcription API. I can make the call work when I use Postman, but this requires manually selecting a file to upload (Postman provides a GUI to select the…
0
votes
1 answer

Change language for text-to-speech Microsoft.Speech.Synthesis

I found out how to change gender, rate, and volume, but I'm wondering if it is possible to change the language, or better, to offer several different languages to choose from, for Microsoft.Speech.Synthesis text to speech. I just can't find useful…
user6922622
0
votes
1 answer

Microsoft Cognitive Speech-to-Text Service --- Choose Microphone

I am using the Microsoft Cognitive Speech-to-Text service (MicrophoneRecognitionClient). It seems to use the default microphone of my PC. If I have multiple microphones on my machine, and I want to specify which microphone to use, is there any way I can…
0
votes
0 answers

No recognizer of the required ID found.\r\nParameter name: culture

I am developing a speech recognition program using the Microsoft speech recognizer [Microsoft.Speech.Recognition]: SpeechRecognitionEngine sre; private void button1_Click(object sender, EventArgs e) { sre = new SpeechRecognitionEngine(new …
0
votes
1 answer

C# speech recognition without predefined grammar

I'm trying to use speech recognition in a C# application, but this way I only get the predefined phrases as output: sList.Add(new string[] { "hello", "test", "works", "exit"}); Can I get output the same way the Google Speech Recognition API works,…
0
votes
1 answer

Java Web integrated with Bing Speech API

I am developing an application using Java Web with JSF and would like to integrate Speech API using JavaScript. What I did was insert the 'speech.1.0.0.js' file in my application and used the 'index.html' to test (informed the key and…
0
votes
0 answers

How to work with SpeechAPI5.* in Windows 10 with Delphi XE2 or higher?

I have an example for SpeechAPI 5.4 in Delphi XE2. Here's part of it: try SpVoice:= TSpVoice.Create(nil); SOTokens := SpVoice1.GetVoices('', ''); for i := 0 to SOTokens.Count - 1 do begin SOToken := SOTokens.Item(I); SOToken._AddRef; …
0xFF