Can you use the same decoder in Pocketsphinx for multiple files?

Question

Is it possible to use the same decoder for multiple wav files in Pocketsphinx (Python)? I have the following code snippet, which is very standard, except that I call the decoder twice on the same file. The outputs are not the same, however. I've also tried using the decoder twice on different files, and the outputs are different depending on the order in which I call the files - the first file decodes correctly, but the second file does not decode correctly. Furthermore, this only happens if there is some output from the first file - if the first file doesn't have any words, then the second file decodes fine. This makes me believe the decoder is modified in some way after decoding one file. Am I correct about this? Is there any way to reset the decoder, or in general make it work for multiple files? It seems like there should be, given this example: https://github.com/cmusphinx/pocketsphinx/blob/master/swig/python/test/decoder_test.py

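import os
import pocketsphinx as ps  # assumed import; the question does not show how `ps` was bound

# MODELDIR and DATADIR must point at your model and audio directories.
config = ps.Decoder.default_config()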
config.set_string('-hmm', os.path.join(MODELDIR, 'en-US/acoustic-model'))
config.set_string('-lm', os.path.join(MODELDIR, 'en-US/language-model.lm.bin'))
config.set_string('-dict', os.path.join(MODELDIR, 'en-US/pronounciation-dictionary.dict'))
config.set_string('-logfn', 'pocketsphinxlog')
decoder = ps.Decoder(config)

wavname16_1 = os.path.join(DATADIR, 'arctic_a0001.wav')
# Decode streaming data.
decoder.start_utt()
stream = open(wavname16_1, 'rb')
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
decoder.end_utt()
stream.close()
words = [(seg.word, seg.prob) for seg in decoder.seg()]
print("arctic1: " + str(words))

wavname16_2 = os.path.join(DATADIR, 'arctic_a0002.wav')
decoder.start_utt()
stream = open(wavname16_2, 'rb')
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
decoder.end_utt()
stream.close()
words = [(seg.word, seg.prob) for seg in decoder.seg()]
print("arctic2: " + str(words))  # str() is needed: a str and a list cannot be concatenated

EDIT - Some further information:

If arctic_a0001.wav is http://festvox.org/cmu_arctic/cmu_arctic/cmu_us_bdl_arctic/wav/arctic_a0001.wav, arctic_a0002.wav is http://festvox.org/cmu_arctic/cmu_arctic/cmu_us_bdl_arctic/wav/arctic_a0002.wav, and the dictionary is the single line:

of AH V

then the current output is:

arctic1: [('<s>', 1), ('of', 1), ('of', -12001), ('<sil>', 0), ('of', -16211), ('<sil>', -1205), ('of', -13991), ('of', 0), ('<sil>', 0), ('of', -31232), ('</s>', 0)]
arctic2: [('<s>', -3), ('[SPEECH]', -725), ('<sil>', -1), ('[SPEECH]', -6), ('<sil>', -20), ('of', -6162), ('[SPEECH]', -397), ('</s>', 0)]

but if we switch them, the output becomes

arctic2: [('<s>', 0), ('of', 0), ('<sil>', 0), ('of', -29945), ('<sil>', -20), ('of', -26004), ('of', 0), ('of', 0), ('<sil>', 0), ('of', -84868), ('of', -35690), ('</s>', 0)]
arctic1: [('<s>', -3), ('of', -14886), ('of', -30237), ('<sil>', 0), ('of', -22103), ('of', 1), ('<sil>', 0), ('of', -30795), ('of', -65040), ('</s>', 0)]

so the outputs of arctic1 and arctic2 depend on the order. Furthermore, if we use arctic1 twice, the output is

[('<s>', 1), ('of', 1), ('of', -12001), ('<sil>', 0), ('of', -16211), ('<sil>', -1205), ('of', -13991), ('of', 0), ('<sil>', 0), ('of', -31232), ('</s>', 0)]
[('<s>', 1), ('of', -24424), ('of', -24554), ('<sil>', 2), ('[SPEECH]', -37257), ('of', -37008), ('<sil>', -461), ('of', -20422), ('of', 0), ('<sil>', 0), ('of', -3570), ('[SPEECH]', -42), ('</s>', 0)]

Maybe it is a problem with me not using start_stream()? I am not sure how I should use it. Even if I use decoder.start_stream() (directly before decoder.start_utt()), the output is different - it becomes

[('<s>', 1), ('of', 1), ('of', -12001), ('<sil>', 0), ('of', -16211), ('<sil>', -1205), ('of', -13991), ('of', 0), ('<sil>', 0), ('of', -31232), ('</s>', 0)]
[('<s>', -2), ('of', -33113), ('of', -29715), ('<sil>', 1), ('[SPEECH]', -37258), ('of', -37009), ('<sil>', -461), ('of', -20422), ('of', 0), ('<sil>', 0), ('of', -3570), ('[SPEECH]', -42), ('</s>', 0)]

If you want the entire log, here (http://pastebin.com/2dNeyS1x) is the log for arctic1 before arctic2, and here (http://pastebin.com/Nkvj2G0g) is the log for arctic2 before arctic1, while here is the log for arctic1 two times in a row with start_stream (http://pastebin.com/HWq6j7X2), and here is the log for arctic1 two times in a row without start_stream (http://pastebin.com/MsadW4nh).

score 0 · Accepted Answer · answered Aug 05 '16 at 20:18

Is it possible to use the same decoder for multiple wav files in Pocketsphinx (Python)?

Yes.

I have the following code snippet, which is very standard, except that I call the decoder twice on the same file. The outputs are not the same, however.

You need to call decoder.start_stream() for the second file to reset the decoder timings.
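
As a minimal sketch of that advice, the per-file loop from the question can be wrapped in a helper that calls start_stream() before every utterance. The helper name decode_file and its chunk_size parameter are made up for illustration; start_stream(), start_utt(), process_raw(), end_utt() and seg() are the same decoder calls the question already uses. Like the original loop, it passes the file's bytes to the decoder unchanged.

```python
def decode_file(decoder, path, chunk_size=1024):
    """Decode one audio file with an existing decoder; return (word, prob) pairs.

    `decode_file` and `chunk_size` are hypothetical names for this sketch.
    """
    decoder.start_stream()  # reset the decoder timings before each new file
    decoder.start_utt()
    with open(path, 'rb') as stream:
        while True:
            buf = stream.read(chunk_size)
            if not buf:
                break
            # Same arguments as in the question: no_search=False, full_utt=False.
            decoder.process_raw(buf, False, False)
    decoder.end_utt()
    return [(seg.word, seg.prob) for seg in decoder.seg()]
```

With the decoder built as in the question, calling decode_file(decoder, wavname16_1) and then decode_file(decoder, wavname16_2) reuses the loaded models for both files. This only resets the timings; it does not promise identical output on repeated runs.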

I've also tried using the decoder twice on different files, and the outputs are different depending on the order in which I call the files - the first file decodes correctly, but the second file does not decode correctly. Furthermore, this only happens if there is some output from the first file - if the first file doesn't have any words, then the second file decodes fine.

Well, several different things could affect the result, and it is hard to say which without an example. Please provide sample files and the problematic output so that this part of the question can be answered.

Nikolay Shmyrev

Hi, I won't be able to respond to this until Monday because my files are at work, but I just wanted to thank you for responding so quickly! – user6003782 Aug 06 '16 at 01:45
I edited the parent post to include example files and output! Let me know if you need anything else. – user6003782 Aug 08 '16 at 14:17
I don't see anything wrong in the logs. The result might be slightly different because the decoder keeps internal state (the CMN value); you can see it in the logs. The third iteration should be the same as the second one. – Nikolay Shmyrev Aug 10 '16 at 15:33
Thanks, is there a way to not keep the internal state / make it static? – user6003782 Aug 10 '16 at 20:34
There is no way yet. – Nikolay Shmyrev Aug 11 '16 at 09:13
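
To illustrate why a carried-over CMN value can change the result for identical audio, here is a toy model. It is not Pocketsphinx's actual code. It assumes a deliberately simplified rule: each utterance is normalized with the mean left behind by the previous one, and plain numbers stand in for cepstral vectors. The class ToyLiveCMN and all values are hypothetical.

```python
class ToyLiveCMN(object):
    """Toy mean normalizer that keeps state between utterances (assumed rule)."""

    def __init__(self, initial_mean=0.0):
        self.mean = initial_mean  # prior mean, carried from one utterance to the next

    def normalize(self, frames):
        # Subtract the mean inherited from earlier audio, not this utterance's own.
        normalized = [x - self.mean for x in frames]
        # Then store this utterance's mean as the prior for the next call.
        self.mean = sum(frames) / float(len(frames))
        return normalized


cmn = ToyLiveCMN()
frames = [4.0, 6.0, 5.0]        # made-up feature values for one "file"
first = cmn.normalize(frames)   # uses the initial mean 0.0
second = cmn.normalize(frames)  # uses the mean learned from the first pass
third = cmn.normalize(frames)   # the prior did not change, so this matches `second`
print(first)
print(second)
print(third)
```

In this toy, the first and second passes over the same input produce different normalized values, while the second and third agree. That is the same pattern the comment above describes for the real decoder.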