Questions tagged [speech-recognition]

Speech recognition (SR) is the interdisciplinary sub-field of computational linguistics that draws on knowledge and research in linguistics, computer science, and electrical engineering to develop methodologies and technologies that enable computers and computerized devices, such as those categorized as smart technologies and robotics, to recognize spoken language and translate it into text.

Speech recognition describes the process of analyzing an audio signal stream and converting it to text. This includes splitting the input into meaningful bits of information (tokenization) and judging the relevance of a signal. The latter means distinguishing between irrelevant signals (e.g. noise in the background) and words to be recognized.
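
As a rough illustration of the "relevance" step described above, the sketch below splits a signal into fixed-size frames and marks a frame as likely speech when its RMS energy exceeds a threshold. This is only a toy energy-based detector, not how any particular recognizer works. The function names, the frame size of 160 samples, the threshold of 0.1, and the synthetic test signal are all assumptions made for the example.

```python
import math


def frame_energies(samples, frame_size):
    """Split samples into non-overlapping frames and return each frame's RMS energy."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(math.sqrt(sum(s * s for s in frame) / frame_size))
    return energies


def detect_speech_frames(samples, frame_size=160, threshold=0.1):
    """Return one boolean per frame: True when the frame's RMS energy exceeds the threshold."""
    return [energy > threshold for energy in frame_energies(samples, frame_size)]


# Synthetic input (hypothetical): faint background hum followed by a louder tone,
# both sampled at an assumed rate of 8000 Hz.
quiet = [0.01 * math.sin(2 * math.pi * 50 * n / 8000) for n in range(320)]
loud = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(320)]

flags = detect_speech_frames(quiet + loud)
print(flags)
```

With this input the two quiet frames fall below the threshold and the two loud frames rise above it. Real systems typically use adaptive thresholds or trained models, since a fixed energy cutoff fails as soon as the background noise gets louder than quiet speech.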

Ideally, speech recognition or speech-to-text mechanisms should not be biased towards a specific voice. They should be capable of recognizing arbitrary voices.

5380 questions
26
votes
9 answers

iPhone: Speech Recognition is in IOS SDK available?

Does anyone know whether the "speech to text" and "text to speech" APIs used in Siri are accessible in the iOS 5 or iOS 6 SDK? I researched but couldn't find anything about it in the documentation, so if that's not included in the SDK are there any "Siri" quality…
Spring
  • 11,333
  • 29
  • 116
  • 185
26
votes
5 answers

Android speech recognizing and audio recording in the same time

My application records audio using the MediaRecorder class in an AsyncTask and also uses the Google API to transform speech to text (RecognizerIntent), using the code from this question: How can I use speech recognition without the annoying dialog in android…
woyaru
  • 5,544
  • 13
  • 54
  • 92
26
votes
8 answers

Open source code for voice detection and discrimination

I have 15 audio tapes, one of which I believe contains an old recording of my grandmother and myself talking. A quick attempt to find the right place didn't turn it up. I don't want to listen to 20 hours of tape to find it. The location may not…
Croad Langshan
  • 2,646
  • 3
  • 24
  • 37
26
votes
2 answers

Is there a way to use iOS speech recognition in offline mode?

I want to know if there's a way to use iOS speech recognition in offline mode. According to the documentation (https://developer.apple.com/reference/speech) I didn't see anything about it.
Danyl
  • 2,012
  • 3
  • 19
  • 40
25
votes
3 answers

Audio analysis to detect human voice, gender, age and emotion -- any prior open-source work done?

Is there prior open-source work done in the field of 'Audio analysis' to detect human-voice (say in spite of some background noise), determine speaker's gender, possibly determine no. of speakers, age of speaker(s), and the emotion of speakers? My…
mike.dinnone
  • 732
  • 2
  • 8
  • 17
25
votes
5 answers

Is there a way to use the SpeechRecognizer API directly for speech input?

The Android Dev website provides an example of doing speech input using the built-in Google Speech Input Activity. The activity displays a pre-configured pop-up with the mic and passes its results using onActivityResult() My question: Is there a…
vladimir.vivien
  • 512
  • 1
  • 6
  • 8
25
votes
6 answers

Python pocketsphinx RequestError: missing PocketSphinx module: ensure that PocketSphinx is set up correctly

I am trying to make a Python app that can record audio and translate it into English text using PyAudio, SpeechRecognition and PocketSphinx. I'm running Mac OS X El Capitan, version 10.11.2. Following a tutorial like this one and others, I've…
25
votes
2 answers

Add iOS speech recognition support for web app?

Currently, the HTML5 Web Speech API works great on Google Chrome for all devices except mobile iOS. Text-to-speech works, but speech-to-text is not supported: webkitSpeechRecognition is unavailable. See: Chrome iOS webkit speech-recognition I am…
expireD
  • 361
  • 3
  • 4
25
votes
1 answer

Google speech API

I'm working on my project and I'm about to build a Siri-like application for the desktop computer. I am wondering whether the Google Speech API is reliable and accurate for speech recognition. Can you suggest to me what speech API is the most accurate…
Dheby Chan
  • 267
  • 1
  • 3
  • 4
24
votes
3 answers

Google Speech Recognition API: timestamp for each word?

It's possible to use Google's Speech recognition API to get a transcription for an audio file (WAV, MP3, etc.) by doing a request to http://www.google.com/speech-api/v2/recognize?... Example: I have said "one two three for five" in a WAV file.…
Basj
  • 41,386
  • 99
  • 383
  • 673
24
votes
1 answer

Search for a particular spoken word in audio files

I have around 3000+ audio files of the same author. I need to transcribe those lectures, where the author has said about a particular word. So I need a software solution, which will find automatically all the files where the particular word is…
amol_beast
  • 281
  • 1
  • 2
  • 8
23
votes
3 answers

Android App Integrated with OK Google

Is there a way to issue a voice command something like: OK GOOGLE ASK XXX Some App Specific Question or Command And have it launch "APP" with the recognized text: "Some App Specific Question or Command" My app has speech recognition as a service…
r.t.s.
  • 585
  • 1
  • 4
  • 12
23
votes
2 answers

Android Speech Recognition Continuous Service

I'm trying to create a service to run continuous speech recognition in Android 4.2. Using the answer from this link ( Android Speech Recognition as a service on Android 4.1 & 4.2 ), I created a service that is run from an Activity. My problem is…
rmooney
  • 6,123
  • 3
  • 29
  • 29
23
votes
8 answers

How to set the language in speech recognition on android?

I've been working on the speech recognition API in Android and found out that the speech results vary a lot when the language settings are changed. Is there a way to set it programmatically? Or is there an intent to launch the speech language settings…
Mr.Me
  • 9,192
  • 5
  • 39
  • 51
22
votes
4 answers

Is there an API for Google's speech recognition technology?

I want to try creating a jQuery slideshow using simple voice commands like "next" or "previous". Is there a way to use Google's voice recognition? I know about Chrome's x-webkit-speech, but I have to click a button to use it. I tried MIT's WAMI, but…
Leo Jiang
  • 24,497
  • 49
  • 154
  • 284