Questions tagged [sphinx4]

Sphinx-4 is part of the CMUSphinx speech recognition toolkit. It is a flexible speech decoder for large and small vocabularies, written in Java and released under a BSD license.

This tag is about Sphinx-4, a speech recognition decoder. Speech recognition is a fast-growing domain and is complex by nature. Developing a speech recognition application requires understanding the specifics of the field: the probabilistic nature of the results, the need for thorough testing, the specifics of voice user interface design, and the balance between accuracy and speed.

The main concepts you need to be aware of are the acoustic model, which captures the sounds of the language; the language model, which captures the vocabulary; and the dictionary, which maps words to sounds. Using Sphinx-4 in your application is often straightforward, but you need to be more careful than usual to get everything in place.
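To make the dictionary concept concrete, here is a minimal sketch in plain Java of a word-to-phones lookup. It is not Sphinx-4's own dictionary loader: the class name `PronunciationDictionarySketch`, its methods, and the two sample entries are hypothetical, and the "word followed by space-separated phones" line layout is assumed for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; this is not part of the Sphinx-4 API.
class PronunciationDictionarySketch {

    // Parses lines of the assumed form "word PH1 PH2 ..." into a word -> phones map.
    static Map<String, List<String>> parse(String text) {
        Map<String, List<String>> dict = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split("\\s+");
            List<String> phones =
                    new ArrayList<>(Arrays.asList(parts).subList(1, parts.length));
            dict.put(parts[0], phones);
        }
        return dict;
    }

    // Expands a sentence into one flat phone sequence.
    // A word missing from the dictionary is reported as an error,
    // mirroring the idea that a decoder can only recognize words it has sounds for.
    static List<String> toPhones(Map<String, List<String>> dict, String sentence) {
        List<String> result = new ArrayList<>();
        for (String word : sentence.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            List<String> phones = dict.get(word);
            if (phones == null) {
                throw new IllegalArgumentException("Word not in dictionary: " + word);
            }
            result.addAll(phones);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample entries in the assumed layout.
        String sample = "hello HH AH L OW\nworld W ER L D\n";
        Map<String, List<String>> dict = parse(sample);
        System.out.println(toPhones(dict, "hello world"));
    }
}
```

The sketch shows why a vocabulary word that is absent from the dictionary is a problem: without a phone sequence there is nothing for the acoustic model to match against.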

To learn more about CMUSphinx and Sphinx-4, visit the CMUSphinx page:

https://cmusphinx.github.io/wiki/

Read the tutorial:

https://cmusphinx.github.io/wiki/tutorial/

255 questions
1
vote
0 answers

IncompatibleClassChangeError in sphinx4 while trying WSJ jar

I am trying to use the Sphinx-4 prealpha release for speech recognition, but I am getting the following error: Exception in thread "main" java.lang.IncompatibleClassChangeError: Found class edu.cmu.sphinx.util.props.PropertySheet, but interface was…
Igniter
  • 565
  • 1
  • 4
  • 13
1
vote
0 answers

setting microphone format to Microsoft PCM, 16 bit, mono 8000/16000 Hz in Windows 10

I am making a speech recognition program in java using sphinx4. For that I require the output format of my microphone to be "Microsoft PCM, 16 bit, mono 16000/8000 Hz". I tried to change it from Microphone properties but it shows only 2 options: 2…
Igniter
  • 565
  • 1
  • 4
  • 13
1
vote
0 answers

Using language model tool without any installation

I know that there are some language model tools, such as IRSTLM, MITLM, and SRILM. All of them need to be installed before they can create a language model. However, I need a language model tool which is not needed any installation and can be used…
ziLk
  • 3,120
  • 21
  • 45
1
vote
1 answer

sphinx using JSGF grammar with weights: java.lang.NullPointerException exception

I am using the following grammar: #JSGF V1.0; grammar tag; public = +; = | ; = oh | zero | one | two | three | four | five | six |seven | eight | nine ; = a | b | c | d | e | f |…
Mayya Sharipova
  • 405
  • 4
  • 11
1
vote
1 answer

Trying to get a still image to 'talk' when someone talks in JAVA

I've been trying to wrap my head around using sphinx4 to get a still image to animate when my girlfriend talks for twitch.tv. Something much like this general mittenz guy https://www.youtube.com/watch?v=L2oUE-C2g6Y The talking cat is what I'm trying…
Spogsta
  • 11
  • 1
1
vote
1 answer

Recognizing live speech with Sphinx4 java api

I am trying to run the tutorial program for live speech recognition using Sphinx4. This is the main class: public class LiveRecognition { public static void main(String[] args) throws Exception { Configuration configuration = new…
Zobayer Hasan
  • 2,187
  • 6
  • 22
  • 38
1
vote
1 answer

Disable sphinx4 log messages

I'm trying the CMUSphinx4 tutorials, and I'm getting some weird INFO unitManager, INFO acousticModelLoader, and related output in the console when I try to run them. I came across the same questions here and here. Here they suggested to change value="INFO" to…
user4879707
1
vote
0 answers

Sphinx4 - Is there a faster way to get full text from audio file?

In Sphinx4's TranscriberDemo.java demo, a hypothesis is obtained for each chunk of sound. Is there a faster way to get one final hypothesis at the end of processing? recognizer.startRecognition(stream); SpeechResult result; while…
user2657795
  • 79
  • 1
  • 1
  • 4
1
vote
0 answers

improve the accuracy of decoder

I read that decoder accuracy can be improved by changing the values of two parameters, relativeBeamWidth and wordInsertionProbability, but I do not know which file they are located in. Thanks.
sny
  • 21
  • 2
1
vote
1 answer

Voice Activity Detection (VAD/SAR) with LIUM

I wrote a shell script to train several GMMs for some kinds of voice activity and silence, using the LIUM speaker diarization toolkit. I want to use this to do voice activity detection. The following script extracts MFCC features from a wav…
Johann Hagerer
  • 1,048
  • 2
  • 10
  • 28
1
vote
1 answer

Property exception component:'wsjLoader'

In recent days I have been reading a lot about modifying the HelloWorld demo file and adding new words of our own choice to it. But I am encountering a serious problem which I am unable to resolve. I am listing down my steps and then the error program is…
1
vote
1 answer

Correct parameters for wngram2idngram?

I am trying to generate the arpa format language model with the following commands: text2wngram < weather.txt | grep -v " " > weather.wngram wngram2idngram -vocab weather.vocab < weather.wngram > weather.idngram idngram2lm -vocab_type 0…
g10dras
  • 399
  • 2
  • 11
1
vote
0 answers

Sphinx4 exception from SaxLoader

I am trying to implement speech recognition in Java. I have been following the tutorial and added the required jar files and libraries to the code, but it is not working and shows an error. Following is my code package demo.sphinx.helloworld; import…
1
vote
0 answers

JSGF grammar not parsing

I can't see where I'm breaking my JSGF grammar (for Sphinx4). Or is there any debugging tool for parsing? This compiles: #JSGF V1.0; grammar dialog; = oh | zero | one | two | three | …
jsky
  • 2,225
  • 5
  • 38
  • 54
1
vote
1 answer

CMU Sphinx4 - Custom Language Model

I have a very specific requirement. I am working on an application which will allow users to speak their employee number which is of the format HN56C12345 (any alphanumeric characters sequence) into the app. I have gone through the link:…
Qedrix
  • 453
  • 1
  • 8
  • 15