I have been looking for a way to segment text from an image, specifically an image of a receipt. The problem I'm facing is that receipts come in different layouts, although they always contain a table listing product names, product prices and the amount of each product.
How should I tackle this using TensorFlow?
Of course, I did some research of my own before asking. This is how I would tackle it:
- Train a model to segment the receipt, so I can extract the product table.
- Cut out the table from the image.
- Use Tesseract OCR to get the raw text.
- Use another model to segment the text into product name, price and amount.
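
To make the last step concrete, here is a rough baseline I could compare a trained model against. It is not a TensorFlow model, just a regular expression, and it assumes each OCR line looks like an optional leading amount (for example `2 x`), then the product name, then a price with two decimals at the end of the line. The names `LineItem`, `parse_line` and `parse_table` are my own placeholders, and real receipts will certainly break these assumptions.

```python
import re
from typing import List, NamedTuple, Optional


class LineItem(NamedTuple):
    name: str
    amount: int
    price: float


# Assumed line shape (hypothetical, not taken from any real receipt):
#   [amount [x]] <product name> <price with two decimals>
LINE_RE = re.compile(
    r"^\s*(?:(?P<amount>\d+)\s*[xX]?\s+)?"  # optional amount such as "2 x"
    r"(?P<name>.+?)\s+"                      # product name, as short as possible
    r"(?P<price>\d+[.,]\d{2})\s*$"           # price like 3.98 or 1,50
)


def parse_line(line: str) -> Optional[LineItem]:
    """Parse one OCR line; return None if it does not look like a product row."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    # A missing amount is treated as a single item.
    amount = int(match.group("amount")) if match.group("amount") else 1
    # Accept both decimal point and decimal comma.
    price = float(match.group("price").replace(",", "."))
    return LineItem(match.group("name").strip(), amount, price)


def parse_table(text: str) -> List[LineItem]:
    """Parse the raw OCR text of the cropped table, skipping unparseable lines."""
    items = []
    for line in text.splitlines():
        item = parse_line(line)
        if item is not None:
            items.append(item)
    return items


if __name__ == "__main__":
    raw = "2 x Milk 1L 3.98\nBread 1,50\nTOTAL\n"
    for item in parse_table(raw):
        print(item)
```

Lines that do not end in a price, such as `TOTAL` on its own, are skipped. Anything this baseline gets wrong would be the cases where a learned model has to do better.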
Image of receipt with highlighted segmentation
Another image of a receipt but with a different table layout