Questions tagged [sphinx4]

Sphinx-4 is part of the CMUSphinx speech recognition toolkit. It is a flexible speech decoder for both large and small vocabularies, written in Java and distributed under a BSD license.

This tag is about Sphinx-4, a speech recognition decoder. Speech recognition is a fast-growing domain and is quite complex by nature. Developing a speech recognition application requires an understanding of its specifics: the probabilistic nature of the results, the need for thorough testing, the design of voice user interfaces, and the balance between accuracy and speed.

The main concepts you need to be aware of are the acoustic model, which captures the sounds of the language; the language model, which captures the vocabulary; and the dictionary, which maps words to sounds. Using Sphinx-4 in your application is often straightforward, but you need to be more careful than usual to get everything in place.
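
As a rough illustration of the dictionary's role, the sketch below parses lines in the plain-text "word followed by phones" layout used by CMUSphinx pronunciation dictionaries into a map from words to phone sequences. It does not call Sphinx-4 itself; the class name, method name, and the two sample entries are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Sphinx-4: shows what "maps words to sounds" means.
public class PronunciationDictionarySketch {

    // Parses lines of the form "word PH1 PH2 ..." into a word -> phones map.
    // Blank lines and lines without any phones are skipped.
    public static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> dict = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length < 2) {
                continue;
            }
            // First token is the word; the remaining tokens are its phones.
            List<String> phones =
                new ArrayList<>(Arrays.asList(parts).subList(1, parts.length));
            dict.put(parts[0], phones);
        }
        return dict;
    }

    public static void main(String[] args) {
        // Sample entries are assumptions made for this example.
        List<String> sample = Arrays.asList(
            "hello HH AH L OW",
            "world W ER L D");
        Map<String, List<String>> dict = parse(sample);
        System.out.println(dict.get("hello")); // prints [HH, AH, L, OW]
    }
}
```

In this sketch an alternate pronunciation written as a separate entry (for example with a suffix on the word) would simply be stored under its own key; a real decoder handles such variants more carefully.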

To learn more about CMUSphinx and Sphinx-4, visit the CMUSphinx page:

https://cmusphinx.github.io/wiki/

Read the tutorial:

https://cmusphinx.github.io/wiki/tutorial/

255 questions
1 vote, 1 answer

Sphinx4 ConfidenceResult and SpeechResult

I'm trying to get the confidence score of a SpeechResult by doing ConfidenceResult cr = scorer.score(result); Where result is a SpeechResult and scorer is a ConfidenceScorer. As it turns out this isn't allowed. Is there some way around this that I'm…
Ryan Faulhaber
1 vote, 1 answer

Language Models and Sphinx4

I'm new to Sphinx and I'm trying to write a program that will recognize a word in an audio file that will only contain a single spoken word and then rate the confidence. For a project like this a language model doesn't seem necessary, seeing as how…
Ryan Faulhaber
1 vote, 1 answer

Using GRXML grammars in Sphinx4

Does Sphinx support the use of GRXML grammars, or do I have to convert my existing grammar to the Java Speech Grammar Format?
1 vote, 1 answer

Saving utterance to audio file with Sphinx4

I'm using Sphinx4 to perform speech recognition with a grammar, but I also want, for another purpose, to save what the user said to an audio file, without a grammar. Basically the user says something and when it's silent an audio file is created and I want…
1 vote, 1 answer

Using a simple language model in Sphinx-4

I've learned the basics of how to use Sphinx-4 as a speech recognition toolkit. I've written a number of sentences to build the language model for my small project (10 sentences as a start) to train, since the performance of the SR was not very good for…
Evanescence
1 vote, 0 answers

Poor accuracy Sphinx4

I intend to use Sphinx4 to translate voice to text. I've been reading a few tutorials and reviews to improve the accuracy, and I'm using the following adaptations: Acoustic Model EN-US, Language Model EN-US. The use of generic acoustic model and…
1 vote, 0 answers

getting a wordToken timestamp from result.getTimedBestResult()

I'm in the process of creating a subtitle generator for generic videos. One of the major blockers is getting the timestamp for each word to align with the video, which is kinda killing me at the moment. The result class has a getTimedBestResult()…
1 vote, 2 answers

Motor Control using speech

My aim is to control a motor using speech input from the user. For the speech recognition part I'm using the Sphinx 4 library with the Eclipse Java IDE (Standard version). My operating system is Windows 7. My recognition part is done, so the…
Randu
1 vote, 3 answers

Is there a repository of grammars for CMU Sphinx?

I'm writing an (offline) voice recognition app. I have CMU Sphinx4 set up and working using some of the included demo dictionaries. However, they're of limited scope (e.g. numbers, cities, etc.). Is there a more comprehensive grammar available? …
Ari Russo
1 vote, 2 answers

java speech recognition Sphinx 4

I want to use either sphinx4 or the HTK toolkit to build a speech recognition application that aims to estimate one's age from voice. I understand, to a great extent, the statistical models involved in speech recognition. I am interested in Mel…
Faiyet
1 vote, 1 answer

"Falling back to non-recursive partition" sphinx 4

I trained my acoustic model and got an acceptable accuracy rate (85%) on a small data set (10 Vietnamese words), but when I integrate this model into the Transcriber sample (packaged with Sphinx 4) and try to transcribe a word, which is in the 10 above…
Minh Triet
1 vote, 1 answer

how to import and use a trained acoustic model in a java sphinx4 project

I need help writing a speech recognition program in Java. I have a trained acoustic model and want to ask how I can use it in my program. I am new to speech recognition and I want to…
1 vote, 1 answer

How to install sphinx4?

For the vast majority of you this will probably be straightforward, but I need help installing the sphinx4 speech recognition software. In particular, using cygwin to do so. 1) how does one set the environmental path variable to the java sdk (I had…
1 vote, 1 answer

KeyListener isShiftDown() is reading shift is down when not

So I'm building a sphinx-4 program that will only listen when you hold down the shift button. This is so that I can prevent errors and only have it listen to me when I'm holding down the shift button. When I release the shift button, I want the…
angyxpoo
1 vote, 1 answer

Sphinx4 configurations needed for speech to text translation

I am currently working on Sphinx4 and more specifically the TranslatorDemo. However, when I run it, its default dictionary and model only output digits. The instructions say to change the config.xml file for this particular model which I have…
applecrusher