The translation of spoken words into text. Possible synonyms include automatic speech recognition, ASR, computer speech recognition, speech to text, STT.
Questions tagged [speech-to-text]
2372 questions
6
votes
2 answers
Using gcloud speech api for real-time speech recognition in dart, flutter
I want to use Google's real-time speech recognition API in a Flutter project written in Dart.
I've activated a gcloud account, created the api key (which should be the only necessary authentication method for google speech) and written a basic apk…

anarchy
- 61
- 1
- 2
6
votes
3 answers
Improve Speech Recognition, C#
I use the System.Speech library to recognize speech, but it usually recognizes something very different from what was said.
SpeechRecognizer _rec = new SpeechRecognizer();
DictationGrammar grammar = new DictationGrammar();
grammar.SpeechRecognized += new…

Kaan
- 902
- 2
- 16
- 38
6
votes
2 answers
Module 'google.cloud.speech_v1p1beta1.types' has no 'RecognitionAudio' member
Trying to run the sample code and I am getting this error:
Module 'google.cloud.speech_v1p1beta1.types' has no 'RecognitionAudio' member
Env: python3x, linux, installed and updated google-cloud lib
pip install --upgrade…

Stryker
- 5,732
- 1
- 57
- 70
6
votes
0 answers
How can I get Speech Recognition to stop listening after one word?
I am making a spoken PygLatin translator.
A very basic one at least that tells one translated word.
If I speak more than one word, it gets all jumbled up and takes the first character in the first word and puts it onto the last spoken word with…

Anwar Azeez
- 101
- 4
6
votes
0 answers
SFSpeechRecognizer - detect voice language out of two
I'm developing a voice app for iOS with the ability to use two languages simultaneously. I'm planning to use SFSpeechRecognizer; however, it requires a language locale to be preset. It can't automatically detect which one is used.
So how can I…

Ilia Kandrashou
- 89
- 2
6
votes
1 answer
Google speech to text integration in swift
I am developing an iOS application that takes voice as input and must give text as output. I previously developed and implemented the app with SiriKit.
But the problem is that I am not getting correct output as I…

Harika jetti
- 61
- 1
- 4
6
votes
2 answers
How to Receive data from StartContinuousRecognitionAsync() of Microsoft Cognitive speech client library
I can't find how to get data from StartContinuousRecognitionAsync(). I want to receive the data so that I can process it only after a keyword.

shubhamsjmit
- 81
- 2
- 6
6
votes
2 answers
Google cloud speech to text in Python: Save translation and time to JSON
I am using the standard solution to do speech to text processing with time stamps (see code below). I know from this post that it is possible to add arguments to the gcloud commandline tool, like --format=json.
General question: How do I specify…

tmo
- 1,393
- 1
- 17
- 47
6
votes
4 answers
Google Cloud Speech-to-Text (MP3 to text)
I am using a trial account for the Google Cloud Platform Speech-to-Text API. I am not able to get text from an audio file. I do not know what exact encoding and sample rate (in hertz) I should use for an MP3 file with a bit rate of 128 kbps. I tried various options…

Vikash Patel
- 61
- 1
- 3
6
votes
2 answers
pyspeech (python) - Transcribe mp3 files?
I'd like to transcribe mp3 (speech-to-text) using the pyspeech API. I don't know if this is possible, though.
Is it? How?

Pauly Dee
- 441
- 1
- 7
- 16
6
votes
1 answer
Watson Speech to Text unable to transcode data stream audio/wav
I am using the IBM Watson Speech to Text API:
var SpeechToTextV1 = require('watson-developer-cloud/speech-to-text/v1');
var fs = require('fs');
var request = require('request');
var speech_to_text = new SpeechToTextV1({
"username": "

aginsburg
- 1,223
- 1
- 12
- 22
6
votes
2 answers
Google speech API throws Invalid audio channel count
Google Speech API throws Invalid audio channel count Exception for the audio recorded on a Mac machine.
I'm just using the sample application provided by Google.
com.google.api.gax.grpc.ApiException: io.grpc.StatusRuntimeException: INVALID_ARGUMENT:…

Vijay Innamuri
- 4,242
- 7
- 42
- 67
6
votes
3 answers
Watson Conversation in a live phone call
Can someone show me how to use Watson Conversation and other services (e.g. Twilio) to make a live phone call and carry on a conversation?
I am able to use Watson Conversation, Twilio, and Node-RED to carry on a conversation with a chatbot over SMS. I…

kane
- 5,465
- 6
- 44
- 72
6
votes
4 answers
Google-speech-api transcribing spoken numbers incorrectly
I started using google speech api to transcribe audio.
The audio being transcribed contains many numbers spoken one after the other.
E.g. 273 298
But the transcription comes back 270-3298
My guess is that it is interpreting it as some sort of phone…

Moshe Rayman
- 61
- 1
- 3
6
votes
2 answers
Python speech recognition error converting mp3 file
My first try on audio to text.
import speech_recognition as sr
r = sr.Recognizer()
with sr.AudioFile("/path/to/.mp3") as source:
    audio = r.record(source)
When I execute the above code, the following error…

Yogaraj
- 322
- 1
- 4
- 17