I want to extract key-value pairs from the following image, which contains 2 invoices.
Image example
I am using AWS Textract for this, but I would also like to map the key-value pairs back to their invoices. For example, 'Cornbread SVC' should be mapped to bill #1 and '1 #1 CHKN PLATE' should be mapped to bill #2.
One approach I thought of is to pre-process the image: detect the number of bills and the coordinates of each one, then crop the image accordingly. So an image with 5 bills would yield 5 sets of coordinates, and the original image would be cropped 5 times, once per bill. Each bill would then be sent to AWS Textract as a separate image.
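To make the cropping step concrete, here is a rough sketch of what I have in mind, assuming the bill coordinates are already known (which is exactly the part I am missing). The names `clamp_boxes` and `split_into_bills`, the `padding` parameter, and the pixel box format `(left, upper, right, lower)` are my own placeholders; `image` is assumed to be a Pillow `Image`, or anything else with `.size` and `.crop(box)`.

```python
from typing import List, Sequence, Tuple

# Pixel rectangle in the (left, upper, right, lower) order that Pillow's
# Image.crop expects. The box format is an assumption for this sketch.
Box = Tuple[int, int, int, int]


def clamp_boxes(image_size: Tuple[int, int],
                bill_boxes: Sequence[Box],
                padding: int = 0) -> List[Box]:
    """Grow each bill box by `padding` pixels and keep it inside the image."""
    width, height = image_size
    crops = []
    for left, upper, right, lower in bill_boxes:
        crops.append((
            max(0, left - padding),
            max(0, upper - padding),
            min(width, right + padding),
            min(height, lower + padding),
        ))
    return crops


def split_into_bills(image, bill_boxes: Sequence[Box], padding: int = 10) -> list:
    """Return one cropped image per bill.

    `image` is assumed to expose `.size` and `.crop(box)` like a Pillow Image.
    """
    return [image.crop(box)
            for box in clamp_boxes(image.size, bill_boxes, padding)]


if __name__ == "__main__":
    # Hypothetical coordinates for two side-by-side bills in a 1000x800 image.
    boxes = [(0, 0, 480, 800), (520, 0, 1000, 800)]
    print(clamp_boxes((1000, 800), boxes, padding=10))
```

Each crop would then be sent to Textract on its own (for instance as PNG bytes via `analyze_document` with `FeatureTypes=["FORMS"]`), so every key-value pair in a response belongs to exactly one bill.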
However, I have not been able to figure out a method to detect the number of bills in an image and their boundary coordinates.
Any help would be appreciated. I am open to using any other APIs or methods to achieve this.