Questions tagged [speech-to-text]

The translation of spoken words into text. Possible synonyms include automatic speech recognition, ASR, computer speech recognition, speech to text, STT.

2372 questions
8
votes
1 answer

How to get the last spoken word from SFSpeechRecognitionResult

I am implementing a speech recognition process to convert speech to text using SFSpeechRecognizer. I need to implement an erase option to remove the last character, but with SFSpeechRecognitionResult, result.bestTranscription.formattedString always returns the whole string…
Siva
8
votes
2 answers

wave.Error: unknown format: 3 arises when trying to convert a wav file into text in Python

I need to record audio from the microphone and convert it into text. I have tried this conversion process using several audio clips that I downloaded from the web, and it works fine. But when I try to convert the audio clip I recorded from the…
Hirushi Ekanayake
8
votes
4 answers

Flutter/Dart: speech to text (offline and continuous) for any language

Is there any package that I can use to create an app that can process speech to text? It should include the following features: offline speech to text; continuous listening (10-30 minutes); a recognizer that would work for all languages that Android/IOS…
Chris
8
votes
2 answers

Watson speech to text returns strange error regarding file size

Just started to play with Watson's Voice API. Trying to use their demo file audio-file.flac. You'd have to take my word for it that I'm posting the curl command from the directory where it resides, and that according to the ls -l command the file…
Daniel Kaplan
8
votes
2 answers

How to detect the language spoken in the Google Cloud Platform Machine Learning Speech API

Is there an option to automatically detect the spoken language using Google Cloud Platform Machine Learning's Speech API? https://cloud.google.com/speech/docs/languages lists the supported languages, and the user needs to manually set…
8
votes
3 answers

Phone number and Date of Birth from human speech

Is there an effective Natural Language Processor that can fetch the phone number and date of birth from human speech? Each user has a different way of specifying the phone number and date of birth. Hence, converting speech to text and then parsing…
Ashok Vairavan
8
votes
2 answers

Alexa - How to accept free text as input / slot. Is there any way apart from using a custom slot and providing a huge list?

How to accept free text as input / slot? Is there any way apart from using a custom slot and providing a huge list? Since Literal slot types are deprecated, how can I provide a free text/string input to Alexa?
Tushar Joshi
8
votes
1 answer

TargetInvocationException when using SemanticResultKey

I want to build my grammar to accept multiple numbers. It has a bug when I repeat the number, like saying 'twenty-one'. So I kept reducing my code to find the problem. I reached the following piece of code for the grammar builder: string[]…
Kasparov92
8
votes
2 answers

Cross Browser Speech Recognition

I am currently working on a project in ASP.NET. I need to add voice commands that will work on IE/Chrome/Firefox. I have searched a lot, but haven't found any cross-browser solutions. Is there any JavaScript framework to do it? Can I use Google…
8
votes
3 answers

How to activate speech to text with a button?

I would like to implement a button that, when clicked, would activate Android's speech to text translator like the one provided by Android's keyboard. Specifically, I would like a button that would have the app transcribe what the user is saying in…
Matt Fritze
8
votes
2 answers

Burmese speech to text conversion in Android?

Can we add a custom language for RecognizerIntent? I have searched many SO questions like https://stackoverflow.com/questions/2080401/is-there-a-speech-to-text-api-by-google that solve my problem of using a limited number of languages during speech to text…
7
votes
2 answers

Can CMU Sphinx be set up to recognize ~200 words

I have a client who needs an Android app that can recognize spoken commands. From what I understand, the built-in voice to text functionality actually sends data to Google's servers, which then send back a text translation. This is a major problem,…
lots_of_questions
7
votes
0 answers

Convert speech to text in Flutter for a desktop (Windows) application

I am developing a cross-platform application in Flutter and cannot find any sources for converting speech to text on desktop (Windows). I tried using packages like speech_to_text. I even tried google_speech, which requires an audio file to…
7
votes
0 answers

How to disable disfluency removal for Google Cloud Speech to Text API

I am building an app that captures user audio and analyzes disfluency in a reader's speech, so it is important for me to know all forms of disfluency. I noticed that Google's speech to text cloud API automatically removes disfluencies in speech.…
AspiringMat
7
votes
1 answer

Batch transcription with Microsoft Azure (REST API)

I want to transcribe longer audio files (at least 5 minutes) using REST APIs from Microsoft. There are a lot of different products and names, e.g. Speech service API or Bing Speech API. None of the REST APIs I tried so far supports transcribing longer…