I need to measure the word error rate (WER) given an Automatic Speech Recognition (ASR) transcript and a ground-truth transcript.
Searching for a program to help with this, I found this repo on GitHub; I tried it and it calculates what I need.
As far as I understand, this program is built to run in a terminal; in fact, that is how I tried it and it worked.
Now I want to run the logic of this program over a group of files, so I wrote some code in a Jupyter notebook that imports the module used and calculates the WER for each pair of files inside a directory of folders.
The problem I am facing now is that the code runs, but I get wrong metrics, whereas I got the correct metrics when running from the terminal.
Can someone please help me understand whether this problem could be related to how argparse works, or to how the asr_evaluation module manages its variables?
Please give me some hints on how I can debug or fix this problem.
That being said, here is what I have and what I did:
- Directory of files and folders used
|-- ASR
|   |-- 1
|   |   |-- ref_groundTruth.txt
|   |   |-- hyp_fakeASR.txt
|   |
|   |-- 2
|   |   |-- ref_groundTruth.txt
|   |   |-- hyp_groundTruth.txt
Because the imported module expects an argparse object, I copied from the repo the lines that create the parser (create_parser).
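For context, here is a minimal, standalone sketch (with throwaway file names, not my real ones) of what I understand parse_args to do with these positionals: since they are declared with argparse.FileType('r'), argparse opens each file at parse time, so args.ref and args.hyp end up being open file handles rather than path strings.

```python
import argparse
import os
import tempfile

# Same two positionals as in create_parser, nothing else.
parser = argparse.ArgumentParser()
parser.add_argument('ref', type=argparse.FileType('r'))
parser.add_argument('hyp', type=argparse.FileType('r'))

# Throwaway files just to have something to parse against.
tmp = tempfile.mkdtemp()
ref_path = os.path.join(tmp, 'ref.txt')
hyp_path = os.path.join(tmp, 'hyp.txt')
with open(ref_path, 'w') as f:
    f.write('reference line')
with open(hyp_path, 'w') as f:
    f.write('hypothesis line')

args = parser.parse_args([ref_path, hyp_path])
print(args.ref.read())   # args.ref is an already-open file object
args.ref.close()
args.hyp.close()
```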
Results when I ran the program in the terminal vs. in Jupyter with the developed code: terminal's output & Jupyter's output
- The text used for ref_groundTruth.txt and hyp_groundTruth.txt is the following:
Esto es un texto de prueba para utilizar la libreria ASR, luego de validar, paso al siguiente nivel
- The text used for hyp_fakeASR.txt is the following:
Esto es un text para test para utilizar la libreria ASR, luego de validar, paso al siguiente nivel
- The developed code is:
# Import libraries
from asr_evaluation.asr_evaluation import *
import argparse
from os import walk

# Let's define the parser structure to build every time we need to pass args to the asr_evaluation module
def create_parser():
    parser = argparse.ArgumentParser(description='Evaluate an ASR transcript against a reference transcript.')
    parser.add_argument('ref', type=argparse.FileType('r'), help='Reference transcript filename')
    parser.add_argument('hyp', type=argparse.FileType('r'), help='ASR hypothesis filename')
    print_args = parser.add_mutually_exclusive_group()
    print_args.add_argument('-i', '--print-instances', action='store_true',
                            help='Print all individual sentences and their errors.')
    print_args.add_argument('-r', '--print-errors', action='store_true',
                            help='Print all individual sentences that contain errors.')
    parser.add_argument('--head-ids', action='store_true',
                        help='Hypothesis and reference files have ids in the first token? (Kaldi format)')
    parser.add_argument('-id', '--tail-ids', '--has-ids', action='store_true',
                        help='Hypothesis and reference files have ids in the last token? (Sphinx format)')
    parser.add_argument('-c', '--confusions', action='store_true', help='Print tables of which words were confused.')
    parser.add_argument('-p', '--print-wer-vs-length', action='store_true',
                        help='Print table of average WER grouped by reference sentence length.')
    parser.add_argument('-m', '--min-word-count', type=int, default=1, metavar='count',
                        help='Minimum word count to show a word in confusions (default 1).')
    parser.add_argument('-a', '--case-insensitive', action='store_true',
                        help='Down-case the text before running the evaluation.')
    parser.add_argument('-e', '--remove-empty-refs', action='store_true',
                        help='Skip over any examples where the reference is empty.')
    return parser
# My path is the folder that contains the ASR transcripts and the ground truth
mypath = './ASR'

# Let's define some macro variables
cnt = 1
prefix_ref = 'ref_'
prefix_hyp = 'hyp_'

# With os.walk we find all the folders related to the transcribed audio, which contain
# the ASR and ground-truth text files
for (dirpath, dirnames, filenames) in walk(mypath):
    print(f'-------{cnt}------')
    if len(filenames) > 1:
        groundTruth = dirpath + '/' + ''.join([word for word in filenames if word.startswith(prefix_hyp)])
        fakeASR = dirpath + '/' + ''.join([word for word in filenames if word.startswith(prefix_ref)])
        parser = create_parser()
        args = parser.parse_args([groundTruth, fakeASR])
        main(args)
    cnt += 1
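One check I can think of (a sketch with hypothetical stand-in files, reusing only the two positionals from create_parser) is to print args.ref.name and args.hyp.name before calling main, since argparse assigns positionals purely by order: whatever path comes first in the list becomes ref, and argparse will not warn if the order is wrong.

```python
import argparse
import os
import tempfile

# Only the two positionals from create_parser are needed for this check.
parser = argparse.ArgumentParser()
parser.add_argument('ref', type=argparse.FileType('r'))
parser.add_argument('hyp', type=argparse.FileType('r'))

# Hypothetical stand-ins for the files found by os.walk.
tmp = tempfile.mkdtemp()
ref_path = os.path.join(tmp, 'ref_groundTruth.txt')
hyp_path = os.path.join(tmp, 'hyp_fakeASR.txt')
for p in (ref_path, hyp_path):
    open(p, 'w').close()

# If the paths are passed in the wrong order, argparse does not complain:
# the hypothesis file silently becomes args.ref and vice versa.
args = parser.parse_args([hyp_path, ref_path])
print(args.ref.name)  # ends with 'hyp_fakeASR.txt' -> the order was swapped
print(args.hyp.name)  # ends with 'ref_groundTruth.txt'
args.ref.close()
args.hyp.close()
```

Since WER is normalized by the number of reference words, swapping ref and hyp can change the reported metric, so verifying the order this way seems worth doing before suspecting the module itself.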