Questions tagged [speech-to-text]

The translation of spoken words into text. Possible synonyms include automatic speech recognition, ASR, computer speech recognition, speech to text, STT.

2372 questions
6 votes
2 answers

Using gcloud speech api for real-time speech recognition in dart, flutter

I want to use Google's real-time speech recognition api in a flutter project, written in dart. I've activated a gcloud account, created the api key (which should be the only necessary authentication method for google speech) and written a basic apk…
anarchy
6 votes
3 answers

Improve Speech Recognition, C#

I use the System.Speech library to recognize speech, but the recognized text is usually very different from what was spoken. SpeechRecognizer_rec = new SpeechRecognizer(); DictationGrammar grammar = new DictationGrammar(); grammar.SpeechRecognized += new…
Kaan
6 votes
2 answers

Module 'google.cloud.speech_v1p1beta1.types' has no 'RecognitionAudio' member

Trying to run the sample code and I am getting this error: Module 'google.cloud.speech_v1p1beta1.types' has no 'RecognitionAudio' member. Env: python3x, linux, installed and updated google-cloud lib pip install --upgrade…
Stryker
6 votes
0 answers

How can I get Speech Recognition to stop listening after one word?

I am making a spoken PygLatin translator. A very basic one, at least, that tells one translated word. If I speak more than one word, it gets all jumbled up and takes the first character in the first word and puts it onto the last spoken word with…
6 votes
0 answers

SFSpeechRecognizer - detect voice language out of two

I'm developing a voice app for iOS, with the ability to use two languages simultaneously. I'm planning to use the SFSpeechRecognizer; however, it requires a language locale to be preset. It can't automatically detect which one is used. So how can I…
6 votes
1 answer

Google speech to text integration in swift

I am developing an application that takes voice as input and must give text as output. It is an iOS app, and I previously implemented it with SiriKit. But the problem is that I am not getting correct output as I…
Harika jetti
6 votes
2 answers

How to Receive data from StartContinuousRecognitionAsync() of Microsoft Cognitive speech client library

I am not able to find how to get data from StartContinuousRecognitionAsync(). I want to receive the data so that I can process it only after a keyword.
shubhamsjmit
6 votes
2 answers

Google cloud speech to text in Python: Save translation and time to JSON

I am using the standard solution to do speech to text processing with time stamps (see code below). I know from this post that it is possible to add arguments to the gcloud commandline tool, like --format=json. General question: How do I specify…
tmo
6 votes
4 answers

Google Cloud Speech-to-Text (MP3 to text)

I am using the Google Cloud Platform Speech-to-Text API with a trial account. I am not able to get text from an audio file. I do not know exactly which encoding and sample rate hertz I should use for an MP3 file with a bit rate of 128 kbps. I tried various options…
Vikash Patel
6 votes
2 answers

pyspeech (python) - Transcribe mp3 files?

I'd like to transcribe mp3 (speech-to-text) using the pyspeech API. I don't know if this is possible, though. Is it? How?
Pauly Dee
6 votes
1 answer

Watson Speech to Text unable to transcode data stream audio/wav

I am using the IBM Watson Speech to Text API: var SpeechToTextV1 = require('watson-developer-cloud/speech-to-text/v1'); var fs = require('fs'); var request = require('request'); var speech_to_text = new SpeechToTextV1({ "username": "
aginsburg
6 votes
2 answers

Google speech API throws Invalid audio channel count

Google Speech API throws an Invalid audio channel count exception for audio recorded on a Mac machine. I'm just using the sample application provided by Google. com.google.api.gax.grpc.ApiException: io.grpc.StatusRuntimeException: INVALID_ARGUMENT:…
Vijay Innamuri
6 votes
3 answers

Watson Conversation in a live phone call

Can someone show me how to use Watson Conversation and other services (e.g. Twilio) to make a live phone call and carry on a conversation? I am able to use Watson Conversation, Twilio, and NodeRED to carry a conversation with a chatbot over SMS. I…
kane
6 votes
4 answers

Google-speech-api transcribing spoken numbers incorrectly

I started using the Google Speech API to transcribe audio. The audio being transcribed contains many numbers spoken one after the other, e.g. 273 298. But the transcription comes back as 270-3298. My guess is that it is interpreting it as some sort of phone…
6 votes
2 answers

Python speech recognition error converting mp3 file

My first try at audio to text. import speech_recognition as sr r = sr.Recognizer() with sr.AudioFile("/path/to/.mp3") as source: audio = r.record(source) When I execute the above code, the following error…
Yogaraj