
I am studying how the HTK tools work for handwriting recognition. Following the ICFHR-2010 tutorial, I ran the examples for the "Spanish-Numbers" corpus and obtained the resulting HMMs (files stored in the folder hmm and listed in HMMsList) and res32.mlf, which holds the recognition results produced by HVite. I also have the master label file SamplesRef.mlf. Now I want to see recognition statistics, so I am learning the HResults tool.

When I run HResults as

 HResults -I SamplesRef.mlf HMMsList res32.mlf

I see

====================== HTK Results Analysis =======================
  Date: Tue Mar 31 15:21:11 2015
  Ref : SamplesRef.mlf
  Rec : res32.mlf
------------------------ Overall Results --------------------------
 SENT: %Correct=0.00 [H=0, S=2, N=2]
 WORD: %Corr=77.78, Acc=77.78 [H=7, D=0, S=2, I=0, N=9]
===================================================================

But if I add the -p option to get a confusion matrix, I get the following error message:

~/icfhr$ HResults -p -I SamplesRef.mlf HMMsList res32.mlf
 ERROR [+3331]  Index: Label millones not in list[0 of 19]
FATAL ERROR - Terminating program HResults

As I understand it, the message means that there is no HMM named "millones", and I found that the samples in my res32.mlf look like this:

"'*'/210341.rec"
mil
seiscientos
cincuenta
y
siete
millones
.

If I use a text editor to turn res32.mlf into res33.mlf, with content like:

"'*'/210341.rec"
m
i
l
s
e
i
s
c
i

... and so on.

And use samples.mlf (instead of SamplesRef.mlf), which looks like this inside:

"*/210341.lab"
m
i
l
@
q
u
i
n
i
e
n
t
o
s
@
c

... and so on.

I get the desired result:

~/icfhr$ HResults -p -I samples.mlf HMMsList res33.mlf
====================== HTK Results Analysis =======================
  Date: Tue Mar 31 15:35:42 2015
  Ref : samples.mlf
  Rec : res33.mlf
------------------------ Overall Results --------------------------
SENT: %Correct=0.00 [H=0, S=2, N=2]
WORD: %Corr=79.63, Acc=77.78 [H=43, D=5, S=6, I=1, N=54]
------------------------ Confusion Matrix -------------------------
       a   c   d   e   i   l   m   n   o   s   t   u   v   y  Del [ %c / %e]
   @   0   0   0   0   0   1   1   0   0   0   0   0   0   0    5 [ 0.0/3.7]
   a   2   0   0   0   0   0   0   0   0   0   0   0   0   0    0
   c   0   2   0   0   0   0   0   0   0   0   0   0   0   0    0
   d   0   0   1   0   0   0   0   0   0   0   0   0   0   0    0
   e   0   0   0   6   0   0   0   0   0   0   0   0   0   0    0
   i   0   0   0   0   6   0   0   0   0   0   0   0   0   0    0
   l   0   0   0   0   0   3   0   0   0   0   0   0   0   0    0
   m   0   0   0   0   0   0   2   0   0   0   0   0   0   0    0
   n   0   1   0   0   0   0   0   6   0   0   0   0   0   0    0 [85.7/1.9]
   o   0   0   0   0   0   0   0   0   4   0   0   0   0   0    0
   q   0   0   0   0   0   0   0   0   0   1   0   0   0   0    0 [ 0.0/1.9]
   s   0   0   0   0   0   0   0   0   0   4   0   0   0   0    0
   t   0   0   0   0   0   0   0   0   0   0   4   0   0   0    0
   u   0   0   0   1   0   0   0   0   0   0   0   1   0   0    0 [50.0/1.9]
   v   0   0   0   0   0   0   0   0   0   0   0   0   1   0    0
   y   0   0   0   0   1   0   0   0   0   0   0   0   0   1    0 [50.0/1.9]
Ins    0   0   0   0   0   0   0   0   0   1   0   0   0   0
===================================================================
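The manual edit from res32.mlf to res33.mlf described above could probably be scripted. Below is a minimal Python sketch, assuming every label line holds only a label name (no start/end times or scores, as in the excerpts shown) and that labels are not quoted. The function name `words_to_chars` is hypothetical and is not part of HTK. Note that it does not insert the "@" separator that samples.mlf uses between words.

```python
def words_to_chars(mlf_text):
    """Split every word label of an MLF into one-character labels.

    Assumption: label lines contain only the label name. Structural
    lines (header, quoted file patterns, "." terminators) pass through.
    """
    out = []
    for line in mlf_text.splitlines():
        stripped = line.strip()
        is_structural = (
            not stripped
            or stripped == "#!MLF!#"
            or stripped.startswith('"')
            or stripped == "."
        )
        if is_structural:
            out.append(line)
        else:
            # One output line per character of the word label.
            out.extend(stripped)
    return "\n".join(out) + "\n"


# Small demonstration on an inline sample instead of a real file.
sample = '"*/210341.rec"\nmil\ny\n.\n'
print(words_to_chars(sample), end="")
```

To apply it to real files, the text of res32.mlf would be read, passed through the function, and written to res33.mlf.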

So, the main question is:

What is the simplest way (without a text editor) to adapt MLF files so that HResults can build a confusion matrix?

(I suppose I am missing some option of some HTK tool… but which tool and which option?)

Any useful ideas would be highly appreciated.

VolAnd

1 Answer


To use the -p option, you need to provide the list of class labels rather than the list of your HMMs. For example, if you are trying to recognize the words Yes, No and Never, then your "HMMsList" file should contain:

Yes
No
Never

This holds regardless of the HMMs that actually make up the words. In other words, your "HMMsList" file should really be a "LabelsList".
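Such a label list does not have to be typed by hand. As a rough sketch (not an HTK feature), the Python function below gathers the distinct labels found in MLF text. It assumes each label line holds only the label name, as in the excerpts in the question, and the name `collect_labels` is hypothetical.

```python
def collect_labels(mlf_text):
    """Return the sorted distinct labels found in MLF text.

    Assumption: label lines contain only the label name
    (no start/end times or scores).
    """
    labels = set()
    for line in mlf_text.splitlines():
        line = line.strip()
        # Skip blanks, the MLF header, quoted file patterns and "." terminators.
        if not line or line == "#!MLF!#" or line.startswith('"') or line == ".":
            continue
        labels.add(line)
    return sorted(labels)


# Demonstration on the sample from the question.
sample = '"*/210341.rec"\nmil\nseiscientos\ncincuenta\ny\nsiete\nmillones\n.\n'
print("\n".join(collect_labels(sample)))
```

Running it over the combined text of SamplesRef.mlf and res32.mlf and saving the output as LabelsList should give a file that can be passed in place of HMMsList, presumably as `HResults -p -I SamplesRef.mlf LabelsList res32.mlf`.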

Asmaa Rabie