The translation of spoken words into text. Possible synonyms include automatic speech recognition, ASR, computer speech recognition, speech to text, STT.
Questions tagged [speech-to-text]
2372 questions
0
votes
0 answers
Node application issue when using Google Speech-to-Text on a CentOS server
I have a Node application using the Google Speech-to-Text service on a CentOS server. Locally I have no problems, but on the CentOS server I get this error when I start the recording:
sox warn alsa can't encode 0-bit unknown or not applicable
I tried…

dorino canciani
- 11
- 5
0
votes
0 answers
AttributeError: 'NoneType' object has no attribute 'read' raised when calling AudioSegment from_file()
I am currently working on a speech to text project. I tried the code in the following video
"https://www.youtube.com/watch?v=B5A1bMi4dJI", and received :
Line 29, in
clip = AudioSegment().from_file(data)
File…

Awrelo5
- 1
0
votes
0 answers
Android Java: Is there a way to make the runtime for SpeechRecognizer.startListening() indefinite?
I am working in Android Studio (Java) to try to create a certain app, and a part of what I need this app to do is as follows: When the user clicks on the screen, the app starts listening for speech. Then, when they click on the screen again, the app…

Rikudou
- 145
- 6
0
votes
0 answers
Google Speech-to-text API
I want to use the Speech-to-Text API for an application built with a no-code tool.
I get this error message:
There was a problem setting up your call.
Raw response for API
401 status code
{
"error": {
"code": 401,
"message": "Request had invalid…
0
votes
1 answer
How to train an Azure speech-to-text model on localhost
Due to project requirements, I can't use the Azure speech-to-text service in the cloud. Therefore, I use the Azure speech-to-text container with Docker. I use two…

蔣鎧駿
- 9
- 1
0
votes
0 answers
Is there any way to extend the speech to text time using "SpeechRecognition - Web APIs"?
I'm creating an extension for Chrome that should use "speech to text". It should recognize the numbers that the user says. This all works correctly, although it has some limits...
Even if I have set recognition.continuous = true;, it turns off the…

Maroš Minčák
- 11
- 2
0
votes
0 answers
How to make a SwiftUI button action repeat in a loop after one click of the button
I'm relatively new to Swift and I'm making a SwiftUI calling application with a deepfaked chatbot that requires me to transcribe the user's speech to text and then play an appropriate response.
I currently have a working flow that starts a…

fman03
- 1
- 2
0
votes
0 answers
How to get rid of the Google dialog when using speech to text in Android Studio?
Is there a way to hide it, or to stop it from popping up?
0
votes
0 answers
Flutter speech to text plugin continuous listening problem
Currently, I am developing an application which can connect an Arduino toy car and control it by voice commands using Bluetooth. For a better user experience, I display all the commands in a chat application user interface, so the user can see the…

Vihanga Randunu
- 391
- 4
- 15
0
votes
0 answers
Invalid Argument Error / Graph Execution Error by [[{{node CTCLoss/CTCLoss}}]] [Op:__inference_train_function_66347]
Can anyone solve this problem with audio? Thank you in advance.

Zubair Ali
- 43
- 3
0
votes
0 answers
How to account for mispronunciations and misunderstandings when matching a speech recognition transcript to the script using Python?
I want to ultimately make a read-along program for children to increase their reading speed. I want them to read a paragraph and measure how long it took for them to read each sentence. So far, the speech recognition AI successfully turns their…

Electro
- 17
- 8
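One possible starting point for the read-along question above is a word-level diff between the script and the recognized transcript, so that a misheard word shows up as a substitution instead of derailing the rest of the match. This is only a minimal sketch using Python's standard `difflib`; the function name `align_words` and the sample sentences are made up for illustration, and real transcripts would also need punctuation stripping and per-word timestamps from the recognizer.

```python
import difflib


def align_words(script: str, transcript: str):
    """Diff the expected script against the recognized transcript, word by word.

    Returns a list of (tag, expected_words, heard_words) tuples, where tag is
    one of difflib's opcodes: 'equal', 'replace', 'delete' or 'insert'.
    """
    expected = script.lower().split()
    heard = transcript.lower().split()
    # autojunk=False keeps frequent words such as "the" from being ignored.
    matcher = difflib.SequenceMatcher(a=expected, b=heard, autojunk=False)
    return [
        (tag, expected[i1:i2], heard[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


if __name__ == "__main__":
    # Hypothetical example: the child says "sad" where the script has "sat".
    for tag, expected, heard in align_words(
        "the cat sat on the mat", "the cat sad on the mat"
    ):
        print(tag, expected, heard)
```

A 'replace' entry marks a likely mispronunciation or recognition error, while the surrounding 'equal' runs show that the reader is still following the script.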
0
votes
0 answers
Speech to Text Model, where the model doesn't attempt to correct errors/grammar?
Is there a Vosk speech to text model, or any other open source/closed source model, where the model would output the spoken words into text, but wouldn't correct them into proper words or fix their grammar, just output what they are saying in…

Deus
- 1
- 1
0
votes
0 answers
How can I tell when exactly a certain sound threshold is passed?
I'm trying to create a script that records video samples from a live stream, labels them, and stores them.
The end goal of this task is to generate annotated data for lip reading.
This means I have to know exactly when a word started, and when it was…

ofer simchovitch
- 39
- 1
- 11
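For the threshold question above, one common approach is to split the audio into short frames, compute each frame's RMS level, and record the time whenever the level crosses the threshold. The sketch below assumes the audio is already available as a list of float samples; the function names, frame size, and threshold are illustrative choices, not taken from the question. Timing resolution is limited to one frame, and a plain energy threshold marks loud regions rather than true word boundaries.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def threshold_crossings(samples, sample_rate, frame_size, threshold):
    """Return ('start' | 'end', time_in_seconds) events for threshold crossings.

    'start' is emitted at the first frame whose RMS reaches the threshold,
    'end' at the first following frame whose RMS drops below it.
    """
    events = []
    above = False
    # Only full frames are examined; a trailing partial frame is skipped.
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        now_above = rms(samples[start:start + frame_size]) >= threshold
        if now_above != above:
            events.append(("start" if now_above else "end", start / sample_rate))
            above = now_above
    return events


if __name__ == "__main__":
    # Synthetic signal: 0.1 s silence, 0.2 s of sound, 0.1 s silence at 1 kHz.
    signal = [0.0] * 100 + [0.5] * 200 + [0.0] * 100
    print(threshold_crossings(signal, sample_rate=1000, frame_size=50, threshold=0.1))
```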
0
votes
0 answers
How to use the Google Speech API to convert audio files to strings?
I am trying to use Google Speech API to convert audio files to strings but I can't make it work when I try to transcribe an audio file that is uploaded by my form element.
0
votes
0 answers
Google Cloud Speech-to-Text API - Convert JSON output to plain text without variables
I used the Google Cloud Speech-to-Text API to convert audio files (interviews) to text. This worked quite well, though I struggle with the JSON output.
Since I only need the transcript result ("Okay, I'm going to read you, the opening question."), I…

Matthias
- 1
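For the JSON question above, a short script can pull out just the transcript strings. This sketch assumes the saved output follows the usual `results` → `alternatives` → `transcript` layout of a Speech-to-Text recognize response; the asker's file may be shaped differently, and the function name `transcripts_from_response` is made up for illustration.

```python
import json


def transcripts_from_response(response_json: str) -> str:
    """Join the top transcript of every result into plain text, one per line."""
    data = json.loads(response_json)
    lines = []
    for result in data.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            # The first alternative is taken as the preferred hypothesis.
            lines.append(alternatives[0].get("transcript", "").strip())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    # Sample payload built around the sentence quoted in the question.
    sample = json.dumps({
        "results": [
            {"alternatives": [
                {"transcript": "Okay, I'm going to read you, the opening question."}
            ]}
        ]
    })
    print(transcripts_from_response(sample))
```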