
All of these questions relate to the pocketsphinx demo project for Android, available on the official CMUSphinx site.

I don't understand what the method switchSearch() really does. The method uses the constant KWS_SEARCH="wakeup". What is the purpose of this value? It doesn't appear in the grammar files (.gram). The method compares searchName with KWS_SEARCH, and I don't know why. KWS_SEARCH is also passed as a parameter to the startListening() method on the recognizer object. Why?

I also don't understand how working with a timeout of 10000 ms improves the result.

This is the switchSearch() method:

private void switchSearch(String searchName) {
    recognizer.stop();

    // If we are not spotting, start listening with timeout (10000 ms or 10 seconds).
    if (searchName.equals(KWS_SEARCH))
        recognizer.startListening(searchName);
    else
        recognizer.startListening(searchName, 10000);

    String caption = getResources().getString(captions.get(searchName));
    ((TextView) findViewById(R.id.caption_text)).setText(caption);
}
Nikolay Shmyrev

1 Answer


From Pocketsphinx tutorial:

Developer can configure several “search” objects with different grammars and language models and switch them in runtime to provide interactive experience for the user.

There are different possible search modes:

  • keyword - efficiently looks for a keyphrase and ignores other speech. Allows you to configure the detection threshold.
  • grammar - recognizes speech according to a JSGF grammar. Unlike keyphrase search, grammar search doesn't ignore words which are not in the grammar but tries to recognize them.
  • ngram/lm - recognizes natural speech with a language model.
  • allphone - recognizes phonemes with a phonetic language model.

Each search has a name and can be referenced by that name; names are application-specific. The function ps_set_search activates a previously added search by name.

To add the search one needs to point to the grammar/language model describing the search. The location of the grammar is specific to the application. If only a simple recognition is required it is sufficient to add a single search or just configure the required mode with configuration options.

The exact design of the searches depends on your application. For example, you might want to listen for an activation keyword first and, once the keyword is recognized, switch to ngram search to recognize the actual command. Once you have recognized the command, you can switch to grammar search to recognize the confirmation and then switch back to keyword listening mode to wait for another command.
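The keyword, command, confirmation cycle described in the tutorial can be pictured as a small transition table. The following is only a sketch of that idea in plain Java with no pocketsphinx dependency; the class name `SearchFlow` and the search names "command" and "confirm" are made up for illustration, and only "wakeup" comes from the demo.

```java
import java.util.HashMap;
import java.util.Map;

public class SearchFlow {
    // "wakeup" is the name used in the demo; the other two names are hypothetical.
    public static final String KWS_SEARCH = "wakeup";       // keyword spotting
    public static final String COMMAND_SEARCH = "command";  // e.g. an ngram search
    public static final String CONFIRM_SEARCH = "confirm";  // e.g. a grammar search

    // Which search to activate after the current one produced a result.
    private static final Map<String, String> NEXT = new HashMap<>();
    static {
        NEXT.put(KWS_SEARCH, COMMAND_SEARCH);
        NEXT.put(COMMAND_SEARCH, CONFIRM_SEARCH);
        NEXT.put(CONFIRM_SEARCH, KWS_SEARCH);
    }

    // Returns the name of the search that follows the given one in the cycle.
    public static String nextSearch(String current) {
        String next = NEXT.get(current);
        if (next == null) {
            throw new IllegalArgumentException("Unknown search: " + current);
        }
        return next;
    }
}
```

In a real application, the name returned by `nextSearch` would be what gets handed to a method like switchSearch() after each recognition result.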

I don't understand what the method switchSearch() really does. The method is using KWS_SEARCH="wakeup" attribute, what is the use of this attribute?

"wakeup" is the name of the keyword spotting search. It was added when the recognizer was initialized, which is why it does not appear in the .gram files. The name can be arbitrary; it just identifies the search.

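KWS_SEARCH is also passed as a parameter to the startListening() method on the recognizer object. Why?

startListening() starts listening with the named search. switchSearch() therefore stops the recognizer, restarts it with whichever search was requested, and updates the caption on screen.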
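To make the "name is just an identifier" point concrete, here is a toy registry in plain Java. It is not the pocketsphinx API: `SearchRegistry`, `addSearch`, and `activate` are hypothetical stand-ins for whatever the recognizer does internally when a search is registered at initialization and later selected by name.

```java
import java.util.HashMap;
import java.util.Map;

public class SearchRegistry {
    // Maps an application-chosen name to a description of the search mode.
    private final Map<String, String> searches = new HashMap<>();
    private String active;

    // Register a search under an arbitrary name (done once, at setup time).
    public void addSearch(String name, String mode) {
        searches.put(name, mode);
    }

    // Select a previously registered search by its name.
    public void activate(String name) {
        if (!searches.containsKey(name)) {
            throw new IllegalArgumentException("No search named " + name);
        }
        active = name;
    }

    // Mode of the currently active search, or null if none is active.
    public String activeMode() {
        return active == null ? null : searches.get(active);
    }
}
```

Registering the keyword search as "wakeup" or under any other string behaves the same, as long as the same string is used again when the search is activated.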

I don't understand how working with a timeout of 10000 ms improves the result.

It has nothing to do with the result; it is only about user experience. When we spot a keyword continuously, we do not need a timeout. When we recognize a command, we wait for 10 seconds and then return to spotting mode.
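The timeout rule can be isolated into two small functions. This is a sketch in plain Java rather than code from the demo: `TimeoutPolicy`, `NO_TIMEOUT`, and both method names are assumptions made for illustration; only the "wakeup" name and the 10000 ms value come from the question.

```java
public class TimeoutPolicy {
    public static final String KWS_SEARCH = "wakeup";
    // Hypothetical marker meaning "listen indefinitely".
    public static final int NO_TIMEOUT = -1;
    public static final int COMMAND_TIMEOUT_MS = 10000;

    // Keyword spotting runs until the keyphrase is heard, so it gets no timeout.
    // Every other search gives the user 10 seconds to speak.
    public static int timeoutFor(String searchName) {
        return KWS_SEARCH.equals(searchName) ? NO_TIMEOUT : COMMAND_TIMEOUT_MS;
    }

    // When a timed search expires without input, fall back to keyword spotting.
    public static String searchAfterTimeout() {
        return KWS_SEARCH;
    }
}
```

The comparison with KWS_SEARCH in switchSearch() plays the same role as `timeoutFor` here: it only decides whether the listening session is open-ended or bounded.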

Nikolay Shmyrev