I have a PDF file that includes a table, and I want to convert it into structured tabular data.
My PDF file includes a fairly complex table, which makes most tools insufficient. For example, I tried the following tools and none of them extracted it well: AWS Textract, Google AI Document, Google Vision, Microsoft Text Recognition.
Actually, Google AI Document got about 70% of the table right, but that is not good enough.
So I searched for a way to train a custom model so that it extracts this table properly. I tried Power Apps AI Builder and Google AutoML Entity Extraction, but neither of them helped (BTW, I wasn't sure what AutoML's purpose is: is it only for prediction, or can it also be used to customize table extraction?).
I would like to know which tools are good for my use case, and whether there is any (AI) tool that I can train on these kinds of tables so that the text extraction improves.