I'm running PocketSphinx on Android (version 5prealpha). I'm using a file-defined keyword recognizer, set up with the following snippet (kwfile is the keyword definition file, and mRecognizer is an instance of SpeechRecognizer):
mRecognizer.addKeywordSearch(DESCRIPTOR, kwfile);
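For context, a PocketSphinx keyword definition file lists one phrase per line, followed by its detection threshold between slashes. The sketch below shows one way such a file could be generated in plain Java. The class name KeywordFileWriter, the phrases, and the threshold values are illustrative placeholders, not the actual contents of kwfile.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of PocketSphinx): builds a keyword list file.
class KeywordFileWriter {

    // Formats each entry as "phrase /threshold/", one entry per line.
    static List<String> formatLines(Map<String, String> thresholds) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> entry : thresholds.entrySet()) {
            lines.add(entry.getKey() + " /" + entry.getValue() + "/");
        }
        return lines;
    }

    // Writes the formatted lines to the target path as UTF-8 text.
    static void write(Path target, Map<String, String> thresholds) throws IOException {
        Files.write(target, formatLines(thresholds), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder phrases and thresholds, chosen only for illustration.
        Map<String, String> thresholds = new LinkedHashMap<>();
        thresholds.put("keyword", "1e-20");
        thresholds.put("stop listening", "1e-30");

        Path target = Files.createTempFile("keywords", ".list");
        write(target, thresholds);
        for (String line : Files.readAllLines(target, StandardCharsets.UTF_8)) {
            System.out.println(line);
        }
    }
}
```

Keeping the thresholds as strings preserves the exponent notation exactly as typed, which makes it easier to compare the file against values tuned by hand.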
Overall, the recognition performance is pretty good, after having optimized the keyword thresholds. However, if I wait some arbitrary amount of time (5 sec up to several minutes) between one keyword utterance and the next, the recognition performance suffers on the second utterance. For example, I'll say "keyword," and it will be recognized. If I wait less than 5 sec and say "keyword" again, the second utterance will likely be recognized (recognition rate over 95%). If, however, I wait 15 sec, the recognition rate drops dramatically, to less than 50%.
My hypothesis is that when I say the keyword the second time, the recognizer is in the middle of a refresh: that is, it's between a Stop Recognition event and a Start Recognition event, and my speech straddles that boundary. Here is a typical view of my logcat. Notice that after about 5 sec, the recognizer "refreshes". This happens roughly every 5 sec. Occasionally it can be as long as 30 sec between "refreshes", but generally it's around 5 sec.
09-26 07:11:06.800 20397-20397/...﹕ Start recognition "kwfile"
09-26 07:11:06.815 20397-23642/...﹕ Starting decoding
09-26 07:11:11.310 20397-20397/...﹕ Stop recognition
09-26 07:11:11.315 20397-20397/...﹕ Start recognition "kwfile"
09-26 07:11:11.360 20397-23645/...﹕ Starting decoding
09-26 07:11:17.405 20397-20397/...﹕ Stop recognition
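To put numbers on the "refresh" interval, the Start/Stop pairs in a log like the one above can be timed. This is a minimal sketch in plain Java, assuming the logcat line layout shown in the excerpt (date, time with milliseconds, then the message). RecognitionCycleTimer and sessionMillis are illustrative names and not part of PocketSphinx.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: measures how long each recognition session lasted.
class RecognitionCycleTimer {

    // Captures the HH:mm:ss.SSS timestamp and whether the event is a Start or a Stop.
    // "Starting decoding" lines do not match because " recognition" must follow.
    private static final Pattern EVENT = Pattern.compile(
            "^\\d{2}-\\d{2} (\\d{2}:\\d{2}:\\d{2}\\.\\d{3}) .*?(Start|Stop) recognition");

    // Returns the duration in milliseconds of every Start -> Stop pair, in log order.
    // Assumes the log does not cross midnight, since only the time of day is parsed.
    static List<Long> sessionMillis(List<String> lines) {
        List<Long> durations = new ArrayList<>();
        LocalTime start = null;
        for (String line : lines) {
            Matcher m = EVENT.matcher(line);
            if (!m.find()) {
                continue; // not a Start/Stop recognition line
            }
            LocalTime time = LocalTime.parse(m.group(1));
            if (m.group(2).equals("Start")) {
                start = time;
            } else if (start != null) {
                durations.add(Duration.between(start, time).toMillis());
                start = null;
            }
        }
        return durations;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
                "09-26 07:11:06.800 20397-20397/...: Start recognition \"kwfile\"",
                "09-26 07:11:06.815 20397-23642/...: Starting decoding",
                "09-26 07:11:11.310 20397-20397/...: Stop recognition",
                "09-26 07:11:11.315 20397-20397/...: Start recognition \"kwfile\"",
                "09-26 07:11:11.360 20397-23645/...: Starting decoding",
                "09-26 07:11:17.405 20397-20397/...: Stop recognition");
        System.out.println(sessionMillis(log)); // prints [4510, 6090]
    }
}
```

For the excerpt above, the two sessions last 4510 ms and 6090 ms, which matches the roughly 5 sec cycle described.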
So, my question is: Is there anything I can do to control this "refresh rate"? Is this caused by something I'm doing wrong in my RecognitionListener implementation (see below; note that I typically don't get any partial results between utterances)? Or is there a PocketSphinx API call that I don't know about to set this refresh rate? Or is there something I could change in the PocketSphinx source to improve this behavior?
class VoiceListener implements RecognitionListener {
    // set when a keyword was already handled in onPartialResult()
    private boolean isCommand = false;

    @Override
    public void onBeginningOfSpeech() {
        // log only
        Log.d(TAG, "Beginning of Speech");
    }

    @Override
    public void onEndOfSpeech() {
        // log only
        Log.d(TAG, "End of Speech");
    }

    @Override
    public void onPartialResult(Hypothesis arg0) {
        if (arg0 != null) {
            Log.d(TAG, "Partial results list: " + arg0.getHypstr());
            isCommand = false;
            // handle recognition results for keywords
            for (String command : this.getCurrentCommands()) {
                if (arg0.getHypstr().contains(command)) {
                    this.onRecognition(arg0.getHypstr());
                    isCommand = true;
                    mRecognizer.stop();
                }
            }
            // call stop, and let onResult() handle grammar results
            if (arg0.getHypstr().contains(Command.STOP_WORD))
                mRecognizer.stop();
        }
    }

    @Override
    public void onResult(Hypothesis results) {
        String data;
        if (results == null) {
            data = null;
        } else {
            data = results.getHypstr();
        }
        Log.d(TAG, "Final results: " + data);
        // handle grammar recognition results
        if (!isCommand) {
            this.onRecognition(data);
        }
        return;
    }