
I'm using CMUSphinx for text alignment. I downloaded the latest sphinx4 and built a text aligner by modifying one of the demos, using the WSJ acoustic models and dictionaries that come with the code. It works occasionally, but for many recordings with quite good pronunciation it simply fails to align even simple text.

What would be the reason? Are the language models I use too limited, so that I should download more model data to feed the recognizer? Is there a good prepackaged Sphinx distribution that would save me from testing different language models and configuring the software?

And thanks a lot :)

Here's the code that I think matters:

byte[] bytes = readContentOfAOggFile();
ByteArrayInputStream inputStream = new ByteArrayInputStream(bytes);

// Set the reference text on the alignment grammar
grammar = (ResetableTextAlignGrammar) cm.lookup("textAlignGrammar");
grammar.setTextAfterAllocation(referenceText);

// Look up the data source first, then hand it the audio stream once
AudioInputStream ai = AudioSystem.getAudioInputStream(inputStream);
dataSource = (AudioFileDataSource) cm.lookup("audioFileDataSource");
dataSource.setInputStream(ai, null);

result = recognizer.recognize();

Please note that this code works for about half of the single-word sentences.

tactoth

1 Answer


What would be the reason?

You need to share the data you are testing with to get an answer to that.

Are the language models I use too limited, so that I should download more model data to feed the recognizer?

Unlikely

Is there a good prepackaged Sphinx distribution that would save me from testing different language models and configuring the software?

Once you share your test data, it's easier to say what is going on there.
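Before sharing the recordings, one thing that may be worth ruling out is the audio format. This is a minimal sketch and not a diagnosis: it assumes the bundled WSJ models expect 16 kHz, 16-bit, mono, signed PCM input, which should be confirmed against the model you actually load. The class name `AudioFormatCheck` and the method `isSphinxFriendly` are made up for illustration. Note also that the stock Java Sound API reads WAV, AU and AIFF; an Ogg file will normally need an additional decoder before `AudioSystem.getAudioInputStream` accepts it.

```java
import java.io.File;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

// Hypothetical helper, not part of sphinx4.
class AudioFormatCheck {

    // Assumed target format: 16 kHz, 16-bit, mono, signed PCM.
    static boolean isSphinxFriendly(AudioFormat format) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
                && format.getSampleRate() == 16000f
                && format.getSampleSizeInBits() == 16
                && format.getChannels() == 1;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java AudioFormatCheck <audio file>");
            return;
        }
        // Throws UnsupportedAudioFileException if Java Sound cannot read the file.
        try (AudioInputStream ai = AudioSystem.getAudioInputStream(new File(args[0]))) {
            AudioFormat format = ai.getFormat();
            System.out.println("Format: " + format);
            System.out.println("Matches assumed model format: " + isSphinxFriendly(format));
        }
    }
}
```

If the check reports a mismatch, converting the recordings to the assumed format before alignment is a cheap experiment to run before changing any models.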

Nikolay Shmyrev