The translation of spoken words into text. Possible synonyms include automatic speech recognition, ASR, computer speech recognition, speech to text, STT.
Questions tagged [speech-to-text]
2372 questions
5
votes
3 answers
Transcribing mp3 to text (python) --> "RIFF id" error
I am trying to turn an mp3 file into text, but my code returns the error outlined below. Any help is appreciated!
This is a sample mp3 file. And below is what I have tried:
import speech_recognition as sr
print(sr.__version__)
r =…

Andrew
- 73
- 1
- 9
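A note on the "RIFF id" error named in the title above: Python's standard `wave` module raises `wave.Error: file does not start with RIFF id` when it is given a file that is not a WAV container, which is how an mp3 looks to it. The following is a minimal sketch for checking the container before transcribing, assuming the audio is a local file; the helper name `looks_like_wav` is hypothetical and not part of any library.

```python
def looks_like_wav(path):
    """Return True if the file starts with a RIFF/WAVE header.

    Hypothetical helper for illustration; it inspects only the first
    12 bytes and does not validate the rest of the file.
    """
    with open(path, "rb") as f:
        header = f.read(12)
    # A WAV file begins with b"RIFF", a 4-byte size field, then b"WAVE".
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"
```

If this returns `False` for the input, converting the file to WAV first (for example with ffmpeg) is one common way forward.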
5
votes
4 answers
Expo Voice Recognition
I've been trying to implement voice recognition in my Expo app. I've tried using a speech-to-text library called react-native-voice, but it does not support Expo. Does anyone know of any other library that I can use? I have read some articles on using…

Akash Ram
- 71
- 1
- 4
5
votes
1 answer
Subtitles/captions with Microsoft Azure Speech-to-text in Python
I've been trying to figure out how to make subtitles with Microsoft Azure Speech Recognition service in Python, but can't figure it out. I've followed the tips someone else has answered here on getting the individual words, but even formatting those…

Python Dong
- 85
- 1
- 5
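For the subtitle formatting mentioned in the excerpt above, a rough sketch of turning word or phrase timings into SubRip (SRT) cues. It assumes the timings arrive as offsets in 100-nanosecond ticks, which is an assumption about the recognition result and should be checked against the service output; the function names `ticks_to_srt_time` and `srt_cue` are hypothetical.

```python
def ticks_to_srt_time(ticks):
    """Format an offset given in 100-nanosecond ticks as HH:MM:SS,mmm.

    Assumption: the input unit is 100 ns ticks (10,000 ticks per millisecond).
    """
    total_ms = ticks // 10_000
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def srt_cue(index, start_ticks, end_ticks, text):
    """Build one SRT cue: sequence number, time range, then the caption text."""
    return (
        f"{index}\n"
        f"{ticks_to_srt_time(start_ticks)} --> {ticks_to_srt_time(end_ticks)}\n"
        f"{text}\n"
    )
```

Joining the cues with a blank line between them gives the body of an `.srt` file; how words are grouped into cues is left open here.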
5
votes
0 answers
Alternative to Deprecated WebSpeech API Function
I need to work with audio, and I need to use the following line of code after creating an AudioContext:
audioContext.createScriptProcessor(4096, 1, 1);
Although this still works, I see that this is deprecated and replaced:
Deprecated
This feature…

Developer
- 103
- 1
- 5
5
votes
2 answers
API or SDK to make speech recognition only for numbers (between 1 and 10000)?
I need a specialized solution optimized to detect numbers between 1 and 1000 to be used on a smartphone.
Best solution would be to have this SDK working offline.
Any ideas?
I cannot find any configuration with Google Speech or Amazon Transcribe to…

fvisticot
- 7,936
- 14
- 49
- 79
5
votes
2 answers
Transcribe an Audio File in Python
I'm trying to transcribe an audio file which is a bit large. Its properties are as follows.
Size : 278.3 MB
Duration : 52 minutes
Format : WAV
Following is the code I used to split it into 60-second segments. Could you please advise to…

Nilani Algiriyage
- 32,876
- 32
- 87
- 121
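The 60-second splitting described in the excerpt above can be sketched with the standard library alone. This is not the asker's code, which is truncated in the listing; it is a minimal illustration assuming an uncompressed PCM WAV input, and the function name `split_wav` and the chunk file naming are hypothetical.

```python
import os
import wave


def split_wav(path, out_dir, chunk_seconds=60):
    """Write consecutive chunk_seconds-long pieces of a WAV file to out_dir.

    Returns the list of chunk file paths in order. The last chunk may be
    shorter than chunk_seconds.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    with wave.open(path, "rb") as src:
        frames_per_chunk = src.getframerate() * chunk_seconds
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the source file
            chunk_path = os.path.join(out_dir, f"chunk_{index:04d}.wav")
            with wave.open(chunk_path, "wb") as dst:
                # Copy channel count, sample width and sample rate from the source.
                dst.setnchannels(src.getnchannels())
                dst.setsampwidth(src.getsampwidth())
                dst.setframerate(src.getframerate())
                dst.writeframes(frames)
            chunk_paths.append(chunk_path)
            index += 1
    return chunk_paths
```

Each chunk can then be passed to the recognizer separately. Cutting at fixed offsets can split a word across two chunks, so a small overlap or silence-based splitting may give better transcripts.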
5
votes
1 answer
Android 2.2: Where is the option for speech input in the emulator?
My Nexus One has it:
Settings includes a "Voice recognizer settings" in the list of "Voice input & output settings".
Google Search has a microphone button next to it, so when I touch it, a dialog prompts me to say what I want to search.
On the…

srf
- 2,410
- 4
- 28
- 41
5
votes
1 answer
Google Cloud Platform: Speech to Text Conversion of Large Media Files
I'm trying to extract text from an mp4 media file downloaded from YouTube. Since I'm using Google Cloud Platform, I thought I'd give Google Cloud Speech a try.
After all the installations and configurations, I copied the following code snippet to get…

Bilal Ahmed Yaseen
- 2,506
- 2
- 23
- 48
5
votes
1 answer
How to improve the accuracy for speech-to-text conversion using recognize_sphinx API in Python
How can we improve the accuracy of speech to text conversion using recognize_sphinx API in Python?
Please find the code below, whose accuracy needs to be improved!
import speech_recognition as sr
# Obtain path to "english.wav" in the same…

vinay4747
- 61
- 1
- 3
5
votes
1 answer
iOS Speech-to-text AVAudioInputNode(?) random crash
I have a speech-to-text function in my app: press and hold the button, and a view controller is animated into view from outside the window bounds and recording starts; release the button, and recording stops and the view is animated back out of the window bounds.
Suddenly I'm…

PeterAntonsen
- 75
- 5
5
votes
1 answer
Android Bluemix not showing speaker tag
I am using IBM Bluemix to transcribe some audio, and I want to use the API's speaker recognition.
I set up the recognizer like this:
private RecognizeOptions getRecognizeOptions() {
return new RecognizeOptions.Builder()
…

bear
- 663
- 1
- 14
- 33
5
votes
1 answer
How to provide hint to iOS speech recognition API?
I want to create an app that receives voice input using the iOS speech API.
Google's API has a speechContext option through which I can provide hints or bias toward some uncommon words.
Does the iOS API provide this feature? I've been searching the site for…

Thanon K
- 177
- 8
5
votes
0 answers
Why doesn't the Google Cloud Speech API transcribe the whole audio file?
I'm trying to transcribe a short interview audio file with the Google Cloud Speech API (asynchronously), but it only transcribes the first half minute of the recording. I had several attempts with recordings longer than one minute and the results were…

Gex
- 2,092
- 19
- 26
5
votes
1 answer
Using Bing Speech example with Curl
First off, I don't really know what I'm doing, so I apologize for the stupid question... just trying to follow the instructions here:
https://www.microsoft.com/cognitive-services/en-us/Speech-api/documentation/GetStarted/GetStarted-cURL
using cURL…

AnarchyJim
- 51
- 1
5
votes
1 answer
How to train an LSTM for speech recognition
I'm trying to train an LSTM model for speech recognition but don't know what training data and target data to use. I'm using the LibriSpeech dataset and it contains both audio files and their transcripts. At this point, I know the target data will be…

JorgeC
- 615
- 3
- 7
- 13