I am currently working on scanning invoices with OCR. All invoices use the "OCRB" font and have the same formatting.
The bottom of a sample invoice looks like this:
This is what the user needs to scan.
I have tried many different libraries to detect what I want, but most of them don't give me the correct result. The best result came from Firebase ML Vision text recognition, but the output I get is this:
I can calculate whether the values are correct, except for the amount in the middle. In this case it is read as "3557 00", but if the user moves the camera a bit further to the right, the result I get is "557 00". Since both MLKit and the other libraries crop tightly around the word, I have no way of knowing whether the full sum was captured or not.
If I got a single space before the word, I would know that it is a complete "word", in this case the full sum.
Does anyone have any ideas about which library to use to get the best result?
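To make the idea concrete, here is a minimal sketch of the kind of check I have in mind. It assumes the recognizer reports a bounding box for each word and that a cut-off word is one whose box touches the edge of the camera frame. The `Word` type, the `is_fully_visible` function, and the 10-pixel margin are hypothetical names and values used for illustration, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A recognized word with the horizontal extent of its bounding box (hypothetical type)."""
    text: str
    left: int   # x coordinate of the left edge of the box, in pixels
    right: int  # x coordinate of the right edge of the box, in pixels


def is_fully_visible(word: Word, frame_width: int, margin: int = 10) -> bool:
    """Return True if the box keeps at least `margin` pixels of clearance
    from both the left and the right edge of the frame.

    The margin plays the role of the "space before the word": if there is
    empty room on both sides, the word was probably not cut off.
    """
    return word.left >= margin and word.right <= frame_width - margin


# Made-up coordinates for a 640-pixel-wide frame.
full_amount = Word("3557", left=220, right=300)
cut_amount = Word("557", left=0, right=60)  # box starts at the frame edge
```

With a check like this, a reading such as "557 00" whose box starts at the frame edge could be rejected until the user moves the camera back, but it only helps if the library actually exposes word positions.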