
I am annotating the text "Borrower Name" with the following rule: "Borrower Name" -> BorrowerNameKeyword ("label" = "Borrower Name"); However, the text comes from OCR analysis, so at times I might get "Borrower Name" as "B0rr0wer Nane". Is it possible to set a tolerance limit so that such text still gets annotated as BorrowerNameKeyword?

Is there any other approach that could help here? I thought of dictionary correction, but that won't help, as it could also auto-correct words that are already right.

1 Answer


You could achieve that with regular expressions in UIMA Ruta. For your particular example, the following rule should work:

"B.rr.wer\\sNa.e" -> BorrowerName;

Likewise, you can create more variants of regular expressions to cover the OCR errors.
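
Writing such variants by hand gets tedious for many labels, so here is a minimal sketch of how they could be generated. It is written in Python purely for illustration and is not part of Ruta; the `OCR_CONFUSIONS` table and the `tolerant_pattern` helper are hypothetical names, and the confusion pairs are assumptions that you would replace with the errors seen in your own OCR output. It uses character classes instead of `.`, so each position only accepts the listed look-alike characters.

```python
import re

# Hypothetical confusion table: each letter maps to the characters that OCR
# is assumed to substitute for it. Extend it with errors from your own data.
OCR_CONFUSIONS = {
    "o": "o0",
    "m": "mn",
}


def tolerant_pattern(label: str) -> str:
    """Build a regex that matches `label` and its listed OCR variants."""
    parts = []
    for ch in label:
        if ch.isspace():
            # Accept one or more whitespace characters between words.
            parts.append(r"\s+")
        elif ch in OCR_CONFUSIONS:
            # Accept the letter itself or any listed look-alike.
            parts.append("[" + re.escape(OCR_CONFUSIONS[ch]) + "]")
        else:
            # Every other character must match literally.
            parts.append(re.escape(ch))
    return "".join(parts)


pattern = tolerant_pattern("Borrower Name")
print(pattern)  # B[o0]rr[o0]wer\s+Na[mn]e

# The OCR-damaged text from the question matches, an unrelated phrase does not.
print(bool(re.fullmatch(pattern, "B0rr0wer Nane")))  # True
print(bool(re.fullmatch(pattern, "Borrower Game")))  # False
```

If you paste the generated pattern into a Ruta rule, the backslash has to be doubled inside the string literal, as in the rule above. Compared with `.`, the character classes accept fewer unintended strings, at the cost of missing OCR errors that are not in the table.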

Viorel Morari