Questions tagged [speech-to-text]

The translation of spoken words into text. Possible synonyms include automatic speech recognition, ASR, computer speech recognition, speech to text, STT.

2372 questions
5 votes, 1 answer

Speech to Text on Android with custom unusual word matching

I would like to be able to use Android's Speech-To-Text engine to recognize a variety of unusual words in sentences. To give an example, the word "electroencephalograph" comes out of STT as "electronics supply graph". When I use Soundex or…
Buns of Aluminum
5 votes, 2 answers

Sample example for Speech to Text in iOS

I am new to iOS programming. Can you please tell me how to convert speech to text in iOS? Is there an API for this? Please suggest how to proceed.
rani
5 votes, 2 answers

Which minimum android version is required for speech to text application

I have written the code for converting speech into text. I just want to know the minimum version of Android required for this.
Nitin Gupta
4 votes, 1 answer

Trying to use Google Speech2Text in C#

The following simple code tries to post a wave file to Google Speech2Text service, but always fails with either a "Gateway Timeout (504)" or general exception "The operation timed out". Can anyone help please? public void ProcessWaveFile(string…
dotNET
4 votes, 0 answers

How to use grammar text editors for Speech-to-Text documents in JavaScript / NodeJS

I'm relatively new to programming (1 year working as an intern, and finishing grad) and might be biting off more than I can chew. This is also my first interaction here (yay). So let me explain the problem thoroughly: I'm currently using Google Speech…
4 votes, 1 answer

Efficient speaker diarization

I am running a VM instance on Google Cloud. My goal is to apply speaker diarization to several .wav files stored in cloud buckets. I have tried the following alternatives, each with its own problems: Speaker diarization on Google's API. This seems…
Luis
4 votes, 1 answer

What are the ways to implement speech recognition in Electron?

I have an Electron app that uses the Web Speech API (SpeechRecognition) to capture the user's voice; however, it's not working. The code: if ("webkitSpeechRecognition" in window) { let SpeechRecognition = window.SpeechRecognition ||…
4 votes, 1 answer

(Mis)-using open.ai whisper for text-to-text translation

I noticed that transcribing speech in multiple languages with the OpenAI Whisper speech-to-text library sometimes accurately recognizes inserts in another language and provides the expected output, for example: 八十多个人 is the same as 八十几个人. So 多 and…
4 votes, 1 answer

Adding transcriptions to Google Speech-to-text to enhance recognition

In our church we have a few Ukrainian refugees who visit. To give them an understanding of the sermon, I made an app to send the translations in real time to Telegram. I have implemented the Google speech-to-text API following this tutorial:…
4 votes, 2 answers

SpeechBrain: Cannot Load Pretrained Model from Local Path

I'm trying to load a pretrained SpeechBrain HuggingFace model from local files; I don't want it to call out to HuggingFace to download. However, unless I change the pretrained_path in hyperparams.yaml, it is still calling out to HuggingFace and…
4 votes, 2 answers

Voice to Text recognition

I am a beginner in Android development. Is it possible to write speech-to-text software that accesses Google's network-based back-end voice-to-text system?
Illep
4 votes, 0 answers

Android Speech Recognition Custom Audio Source

Android RecognizerIntent documentation states that public static final String EXTRA_AUDIO_INJECT_SOURCE The extra key used in intent is providing an already opened audio source for the RecognitionService to use. Data should be a URI to an audio…
Vishal
4 votes, 1 answer

How to get a transcript of an audio or video call within a js web app? I.e. how to route a MediaStream to a speech-to-text API

I want to make a web-app which does video calls with live transcription -- using some 3rd party speech-to-text service (e.g. Google or Amazon). So the peer-to-peer MediaStream would be played to the users, and also sent to the API for…
4 votes, 5 answers

Speech To Text using C#

I am trying to design a text editor in C# and implement voice recognition for the normal file features. Is this possible? I am very sorry if I am repeating a question that has been asked previously. I just want to know…
ArunKumar
4 votes, 1 answer

How to feed an audio file from S3 bucket directly to Google speech-to-text

We are developing a speech application using Google's speech-to-text API. Our data (audio files) is stored in an S3 bucket on AWS. Is there a way to directly pass the S3 URI to Google's speech-to-text API? From their documentation it seems this is…