Questions tagged [microsoft-speech-api]

The Microsoft Speech API (SAPI) provides a high-level interface between an application and speech engines. SAPI implements all the low-level details needed to control and manage the real-time operations of various speech engines.

The two basic types of SAPI engines are text-to-speech (TTS) systems and speech recognizers. TTS systems synthesize text strings and files into spoken audio using synthetic voices. Speech recognizers convert human spoken audio into readable text strings and files.

API for Text-to-Speech

Applications can control text-to-speech (TTS) using the ISpVoice Component Object Model (COM) interface. Once an application has created an ISpVoice object (see Text-to-Speech Tutorial), the application only needs to call ISpVoice::Speak to generate speech output from some text data.

In addition, the ISpVoice interface provides several methods for changing voice and synthesis properties, such as the speaking rate (ISpVoice::SetRate), the output volume (ISpVoice::SetVolume) and the current speaking voice (ISpVoice::SetVoice).
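As a minimal sketch of that pattern in C++ (assuming a Unicode build against the Windows SDK SAPI headers with sapi.lib linked; error handling trimmed):

    #include <windows.h>
    #include <sapi.h>

    int main()
    {
        ::CoInitialize(NULL);

        ISpVoice* pVoice = NULL;
        // Create a SAPI voice object bound to the default audio output.
        HRESULT hr = ::CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL,
                                        IID_ISpVoice, (void**)&pVoice);
        if (SUCCEEDED(hr))
        {
            pVoice->SetRate(0);      // relative rate, -10 (slowest) to 10 (fastest)
            pVoice->SetVolume(100);  // volume as a percentage, 0 to 100
            pVoice->Speak(L"Hello from SAPI", SPF_DEFAULT, NULL);
            pVoice->Release();
        }

        ::CoUninitialize();
        return 0;
    }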

API for Speech Recognition

Just as ISpVoice is the main interface for speech synthesis, ISpRecoContext is the main interface for speech recognition. Like the ISpVoice, it is an ISpEventSource, which means that it is the speech application's vehicle for receiving notifications for the requested speech recognition events.
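A hedged sketch of that event-driven flow, using the shared desktop recognizer, a plain dictation grammar and Win32 event notification (result retrieval and error handling trimmed):

    #include <windows.h>
    #include <sapi.h>

    int main()
    {
        ::CoInitialize(NULL);

        ISpRecognizer*  pRecognizer = NULL;
        ISpRecoContext* pContext    = NULL;
        ISpRecoGrammar* pGrammar    = NULL;

        // Connect to the shared recognizer and create a recognition context from it.
        HRESULT hr = ::CoCreateInstance(CLSID_SpSharedRecognizer, NULL, CLSCTX_ALL,
                                        IID_ISpRecognizer, (void**)&pRecognizer);
        if (SUCCEEDED(hr)) hr = pRecognizer->CreateRecoContext(&pContext);

        // Ask for notifications via a Win32 event, and only for final recognitions.
        if (SUCCEEDED(hr)) hr = pContext->SetNotifyWin32Event();
        if (SUCCEEDED(hr)) hr = pContext->SetInterest(SPFEI(SPEI_RECOGNITION),
                                                      SPFEI(SPEI_RECOGNITION));

        // Activate general dictation and block until the first recognition event fires.
        if (SUCCEEDED(hr)) hr = pContext->CreateGrammar(0, &pGrammar);
        if (SUCCEEDED(hr)) hr = pGrammar->LoadDictation(NULL, SPLO_STATIC);
        if (SUCCEEDED(hr)) hr = pGrammar->SetDictationState(SPRS_ACTIVE);
        if (SUCCEEDED(hr)) hr = pContext->WaitForNotifyEvent(INFINITE);
        // ... fetch the SPEVENT with ISpRecoContext::GetEvents and read the
        //     recognized text from the ISpRecoResult it carries ...

        if (pGrammar)    pGrammar->Release();
        if (pContext)    pContext->Release();
        if (pRecognizer) pRecognizer->Release();
        ::CoUninitialize();
        return 0;
    }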

Source: http://msdn.microsoft.com/en-us/library/ee125077(v=vs.85).aspx

82 questions
0 votes, 1 answer

How to convert a MultipartFile properly to match Microsoft's AudioConfig?

In my Spring Boot application, I accept an audio file as a MultipartFile with @RequestParam. I know I can convert the file into some InputStream. I am also able to convert it into some byte array: @PostMapping("/microsoft") void…
0 votes, 1 answer

Does MS Speech support custom vocabulary?

I have a requirement to write an application which would take an audio file and identify precisely at which points in the file specific words are being spoken. These are not English words, but rather Aramaic words, so would have to be added as…
user1052610 • 4,440 • 13 • 50 • 101
0 votes, 0 answers

Cannot assign value of type '(Type1?, Type2?)' to type 'TypeDef' (aka '(Type1, Type2) -> ()')

I am looking at implementing a Swift version of some Objective-C code. I have a typedef with the declaration: void (^)(Class1 * _Nonnull, Class2 * _Nonnull) The class declarations are: @class Class1 : NSObject; @class Class2 : NSObject; Class2 has…
FerasAS • 273 • 2 • 14
0 votes, 2 answers

How to hit the Microsoft Language Detection API through Postman

I am running the Language Detection Cognitive Services API locally (using the command below): docker run --rm -it -p 5003:5003 --memory 1g --cpus 1 mcr.microsoft.com/azure-cognitive-services/speechservices/language-detection Eula=accept…
0 votes, 1 answer

Using data from a server instead of a file to transcribe with the Microsoft Azure Speech SDK

I am trying to send data to the Azure Speech SDK to transcribe. I want it to receive data from a Python file, put it in a buffer and then transcribe continuously. I am using this sample from the Azure Speech SDK: def…
0 votes, 1 answer

E2247 'TSpVoice::Voice' is not accessible

This used to work on Windows 10, but now it doesn't, and I can't find how to correct the code to make it work. void __fastcall TForm1::ComboBox1Change(TObject *Sender) { ISpeechObjectTokenPtr token; ISpeechObjectTokensPtr tokens; …
Spider • 1 • 1
0 votes, 1 answer

Tenant-Level Search for Outlook via REST API

Is there a way to search across all mailboxes in a tenant, without specifying any particular user? My goal is to search for any text across all mailboxes of a tenant. I came across this link:…
0 votes, 1 answer

Issue with Microsoft chat bot giving double responses to a question instead of the single response I instructed it to give

I am currently using Microsoft's Speech Studio to create a simple chat bot. For all my questions, I need to add a confirmation rule to ask if they need further assistance getting to the location they are looking for. However, after it gets to the…
0 votes, 1 answer

Speech Service Authentication Issue on Bot Framework V4

Getting the following error while trying to get a token from Azure Speech Service. 'https://brazilsouth.api.cognitive.microsoft.com/sts/v1.0/issuetoken 401 (Access Denied)'. Here is the way I'm requesting the token via JavaScript: const res = await…
0 votes, 1 answer

How to use the Azure Speaker Recognition API in Python

I am using Python 2.7. I want to use speaker identification. I want to know how I can enroll and get a profile identified. Can anyone send me the code and the module required to be downloaded? It would work by changing the audio and subscription key. Thank you.
0 votes, 1 answer

Microsoft Azure Speech SDK - Disable Audio Logging

I am trying to use the Microsoft Azure Speech to Text service. I have a working example using the Python Quickstarts. But I am wondering if Microsoft saves the audio files and if there is any way to opt out of it? Any thoughts will be much appreciated. I…
0 votes, 2 answers

Request for higher concurrency for Speech-to-text

I am a developer at Across Cultures - we provide online EAL (English as an Additional Language) support for learners in schools. I've been looking at your Speech Services API and have something working for our requirements; however, we will need…
0 votes, 1 answer

Implementing a TTS service for Windows 10

I'm working on a research project in which we create a new text-to-speech (TTS) engine that converts text to spoken audio. As the engine is already performing well, we are trying to make it usable by a large number of applications, which made us want the…
0 votes, 3 answers

Russian language recognition in Microsoft Speech API

I would like to play a little bit with the Microsoft Speech API. I have found this answer and it works! I have tried to adapt it to recognize Russian. The grammar file looks like this:
Lex Sergeev • 231 • 1 • 3 • 17
0 votes, 0 answers

Microsoft Speech Synthesis: SpeakSsmlAsync without lang attribute

I am using Microsoft Speech Synthesis to play my SSML string using public Prompt SpeakSsmlAsync(string ssmlText); and I got a requirement where I should not use xml:lang, but when I remove the xml:lang attribute from the SSML string I get the below…