I want to do speech recognition using Sphinx.

I'm looking to capture the output/incoming audio of the Ekiga VOIP softphone using Java or Python and pass it on to Sphinx. Right now, the output is directed toward the PulseAudio JACK Sink.

Sorry if I am not using the right terminology as I am quite a noob at this.

If you could point me in the right direction, it would be much appreciated, thanks.

Max Pie

1 Answer

You need to properly formulate the task you want to achieve and select the right tools for it. Software developers do not build their projects on desktop applications like Ekiga, because desktop applications are written for end users, not for developers. You can only modify a desktop application like Ekiga by changing its source code, but that is a whole different story.

If you want a VOIP endpoint connected to a speech recognition service, you need to look at IVR systems and similar tools. An IVR system is a tool designed specifically for implementing complex interactions over VOIP protocols. Some examples are:

Freeswitch

Asterisk

GNU Gatekeeper, a project built on the same Opal technology as Ekiga

You can set up Freeswitch to wait for calls and pass them to Pocketsphinx. You can do the same with Asterisk. For more details, see:

How to use Pocketsphinx from Freeswitch

How to integrate Pocketsphinx in Asterisk using UniMRCP project
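
As a rough illustration of the Asterisk route, here is a minimal sketch of an AGI script in Python that records the caller's audio to a file, which could then be handed to a recognizer. It only speaks the plain-text AGI protocol over stdin/stdout; the helper names (`read_agi_env`, `agi_command`, `record_caller`) and the `/tmp/utterance` path are hypothetical, and the recognition step itself is left out. Treat it as a starting point under those assumptions, not as the way the linked guides do it.

```python
import sys


def read_agi_env(inp):
    """Read the 'agi_*: value' header lines Asterisk sends when the script starts.

    The header block ends with an empty line.
    """
    env = {}
    while True:
        line = inp.readline().strip()
        if not line:
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


def agi_command(command, out, inp):
    """Send one AGI command and return Asterisk's one-line reply."""
    out.write(command + "\n")
    out.flush()
    return inp.readline().strip()


def record_caller(path, out, inp, timeout_ms=10000):
    """Ask Asterisk to record the caller's audio to <path>.wav.

    AGI syntax: RECORD FILE <filename> <format> <escape digits> <timeout ms>
    """
    command = 'RECORD FILE %s wav "#" %d' % (path, timeout_ms)
    return agi_command(command, out, inp)


if __name__ == "__main__":
    env = read_agi_env(sys.stdin)
    # Hypothetical output path; Asterisk appends the .wav extension itself.
    reply = record_caller("/tmp/utterance", sys.stdout, sys.stdin)
    # stdout is reserved for AGI commands, so diagnostics go to stderr.
    sys.stderr.write("channel=%s reply=%s\n" % (env.get("agi_channel"), reply))
```

Such a script would be invoked from the dialplan with the `AGI()` application. Because only the incoming leg of the call is recorded, the resulting file contains the caller's voice alone.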

Nikolay Shmyrev
  • Thanks for the guidance. Should I worry about separating the incoming and outgoing voice, or is Sphinx powerful enough to tell the difference? – Max Pie Nov 30 '12 at 15:46
  • In VOIP, channels are processed separately by default, so you don't need to worry about separation. CMUSphinx cannot separate voices on a single channel; there are other tools for that. – Nikolay Shmyrev Nov 30 '12 at 15:49
  • Thanks. I've looked into the Asterisk AGI and I think I have enough to go on for now. – Max Pie Nov 30 '12 at 15:55