Is it possible to use the same decoder for multiple wav files in Pocketsphinx (Python)?

I have the following code snippet, which is very standard, except that I run the same decoder twice. When I decode the same file twice, the two outputs are not the same. With two different files (as in the snippet), the outputs depend on the order in which I decode them: the first file decodes correctly, but the second does not. Furthermore, this only happens if the first file produces some output; if the first file doesn't contain any words, the second file decodes fine.

This makes me believe that decoding one file modifies the decoder's state in some way. Am I correct about this? Is there any way to reset the decoder, or more generally to make it work for multiple files? It seems like there should be, given the example here: https://github.com/cmusphinx/pocketsphinx/blob/master/swig/python/test/decoder_test.py.
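import os
import pocketsphinx as ps  # assumed import form; MODELDIR and DATADIR are paths defined elsewhere
config = ps.Decoder.default_config()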
config.set_string('-hmm', os.path.join(MODELDIR, 'en-US/acoustic-model'))
config.set_string('-lm', os.path.join(MODELDIR, 'en-US/language-model.lm.bin'))
config.set_string('-dict', os.path.join(MODELDIR, 'en-US/pronounciation-dictionary.dict'))
config.set_string('-logfn', 'pocketsphinxlog')
decoder = ps.Decoder(config)
wavname16_1 = os.path.join(DATADIR, 'arctic_a0001.wav')
# Decode streaming data.
decoder.start_utt()
stream = open(wavname16_1, 'rb')
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
decoder.end_utt()
stream.close()
words = [(seg.word, seg.prob) for seg in decoder.seg()]
print("arctic1: " + str(words))
wavname16_2 = os.path.join(DATADIR, 'arctic_a0002.wav')
decoder.start_utt()
stream = open(wavname16_2, 'rb')
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
decoder.end_utt()
stream.close()
words = [(seg.word, seg.prob) for seg in decoder.seg()]
print("arctic2: " + str(words))  # str() is needed: a str and a list cannot be concatenated
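Since the same loop is repeated for each file, it can be factored into a helper. This is only a minimal sketch of what I am doing, not a fix: `decode_file`, `chunk_size` and `reset_stream` are names I made up for illustration, and it assumes the decoder object exposes `start_stream()`, `start_utt()`, `process_raw()`, `end_utt()` and `seg()` as used in the snippet above.

```python
def decode_file(decoder, path, chunk_size=1024, reset_stream=False):
    """Decode one audio file as a single utterance.

    Returns a list of (word, prob) pairs taken from the segmentation.
    `decode_file`, `chunk_size` and `reset_stream` are hypothetical
    names for this sketch, not part of the Pocketsphinx API.
    """
    if reset_stream:
        # Optional: signal the start of a new stream before the utterance.
        decoder.start_stream()
    decoder.start_utt()
    with open(path, 'rb') as stream:
        while True:
            buf = stream.read(chunk_size)
            if not buf:
                break
            # Same arguments as in the snippet above:
            # no_search=False, full_utt=False.
            decoder.process_raw(buf, False, False)
    decoder.end_utt()
    return [(seg.word, seg.prob) for seg in decoder.seg()]
```

With this, the two halves of the snippet would reduce to `decode_file(decoder, wavname16_1)` followed by `decode_file(decoder, wavname16_2)`, reusing the same `decoder` object.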
EDIT - Some further information:
If arctic_a0001.wav is http://festvox.org/cmu_arctic/cmu_arctic/cmu_us_bdl_arctic/wav/arctic_a0001.wav, arctic_a0002.wav is http://festvox.org/cmu_arctic/cmu_arctic/cmu_us_bdl_arctic/wav/arctic_a0002.wav, and the dictionary is the single line:
of AH V
then the current output is:
arctic1: [('<s>', 1), ('of', 1), ('of', -12001), ('<sil>', 0), ('of', -16211), ('<sil>', -1205), ('of', -13991), ('of', 0), ('<sil>', 0), ('of', -31232), ('</s>', 0)]
arctic2: [('<s>', -3), ('[SPEECH]', -725), ('<sil>', -1), ('[SPEECH]', -6), ('<sil>', -20), ('of', -6162), ('[SPEECH]', -397), ('</s>', 0)]
but if we switch the order of the two files, the output becomes:
arctic2: [('<s>', 0), ('of', 0), ('<sil>', 0), ('of', -29945), ('<sil>', -20), ('of', -26004), ('of', 0), ('of', 0), ('<sil>', 0), ('of', -84868), ('of', -35690), ('</s>', 0)]
arctic1: [('<s>', -3), ('of', -14886), ('of', -30237), ('<sil>', 0), ('of', -22103), ('of', 1), ('<sil>', 0), ('of', -30795), ('of', -65040), ('</s>', 0)]
so the outputs for arctic1 and arctic2 depend on the order. Furthermore, if we decode arctic1 twice in a row, the output is:
[('<s>', 1), ('of', 1), ('of', -12001), ('<sil>', 0), ('of', -16211), ('<sil>', -1205), ('of', -13991), ('of', 0), ('<sil>', 0), ('of', -31232), ('</s>', 0)]
[('<s>', 1), ('of', -24424), ('of', -24554), ('<sil>', 2), ('[SPEECH]', -37257), ('of', -37008), ('<sil>', -461), ('of', -20422), ('of', 0), ('<sil>', 0), ('of', -3570), ('[SPEECH]', -42), ('</s>', 0)]
Maybe the problem is that I am not calling start_stream()? I am not sure how it is supposed to be used. Even if I call decoder.start_stream() (directly before decoder.start_utt()), the two outputs for arctic1 still differ:
[('<s>', 1), ('of', 1), ('of', -12001), ('<sil>', 0), ('of', -16211), ('<sil>', -1205), ('of', -13991), ('of', 0), ('<sil>', 0), ('of', -31232), ('</s>', 0)]
[('<s>', -2), ('of', -33113), ('of', -29715), ('<sil>', 1), ('[SPEECH]', -37258), ('of', -37009), ('<sil>', -461), ('of', -20422), ('of', 0), ('<sil>', 0), ('of', -3570), ('[SPEECH]', -42), ('</s>', 0)]
If you want the entire logs, here they are: arctic1 before arctic2 (http://pastebin.com/2dNeyS1x), arctic2 before arctic1 (http://pastebin.com/Nkvj2G0g), arctic1 two times in a row with start_stream (http://pastebin.com/HWq6j7X2), and arctic1 two times in a row without start_stream (http://pastebin.com/MsadW4nh).
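One check that might help confirm whether state carries over is to decode each file twice: once with a shared decoder and once with a freshly constructed one, and then compare the results. The sketch below only illustrates that idea. `compare_reused_vs_fresh`, `make_decoder` and `decode` are hypothetical names: `make_decoder` would be something like `lambda: ps.Decoder(config)`, and `decode` would be a function wrapping the start_utt / process_raw / end_utt loop from my snippet.

```python
def compare_reused_vs_fresh(make_decoder, decode, paths):
    """Compare decoding with one shared decoder against a fresh one per file.

    make_decoder: zero-argument callable returning a new decoder (hypothetical).
    decode:       callable (decoder, path) -> result (hypothetical).
    Returns a list of (path, reused_result, fresh_result, same) tuples.
    """
    shared = make_decoder()
    report = []
    for path in paths:
        reused = decode(shared, path)         # decoder that has seen earlier files
        fresh = decode(make_decoder(), path)  # decoder that has seen nothing yet
        report.append((path, reused, fresh, reused == fresh))
    return report
```

If the `same` flag is False for any file after the first, that would suggest the earlier utterances changed something inside the shared decoder.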