So far my Textract tests have been very impressive for handwriting, but it sometimes fails to recognise some forms and some values. Is it possible to train it? Since I'm scanning the same type of form/document, it would be very useful to amend the results and teach it where the boundaries of some form elements lie, as well as some key-value associations.

Not being able to do this would be a real deal breaker for the kind of service I'm trying to design.

Thanks in advance.

John Rotenstein

2 Answers

No. It is not possible to 'train' Amazon Textract.

The available actions are limited to analysing a document and detecting text.

See: Actions - Amazon Textract
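
Since the service itself cannot be trained, one possible workaround is to correct its output afterwards. The following is only a minimal sketch, assuming you call the `AnalyzeDocument` action through boto3 with the `FORMS` feature and then walk the returned blocks. The helper names (`analyze_form`, `key_value_pairs`, `apply_corrections`) and the alias table are hypothetical and not part of Textract.

```python
from typing import Dict, List


def analyze_form(image_bytes: bytes) -> dict:
    """Call Textract's AnalyzeDocument action with form extraction enabled."""
    import boto3  # imported lazily so the parsing helpers work without AWS access

    client = boto3.client("textract")
    return client.analyze_document(
        Document={"Bytes": image_bytes},
        FeatureTypes=["FORMS"],
    )


def _text_of(block: dict, by_id: Dict[str, dict]) -> str:
    """Join the WORD children of a block into a single string."""
    words: List[str] = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def key_value_pairs(response: dict) -> Dict[str, str]:
    """Map each detected form key to the text of its linked value block(s)."""
    by_id = {block["Id"]: block for block in response["Blocks"]}
    pairs: Dict[str, str] = {}
    for block in response["Blocks"]:
        if block["BlockType"] != "KEY_VALUE_SET":
            continue
        if "KEY" not in block.get("EntityTypes", []):
            continue
        value_parts: List[str] = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                for value_id in rel["Ids"]:
                    value_parts.append(_text_of(by_id[value_id], by_id))
        pairs[_text_of(block, by_id)] = " ".join(p for p in value_parts if p)
    return pairs


def apply_corrections(pairs: Dict[str, str], key_aliases: Dict[str, str]) -> Dict[str, str]:
    """Rename misread keys using a hand-maintained alias table (hypothetical)."""
    return {key_aliases.get(key, key): value for key, value in pairs.items()}
```

For a form type you scan repeatedly, you could build the alias table from the mistakes you see, for example `apply_corrections(key_value_pairs(response), {"Narne:": "Name"})`. This does not teach Textract anything; it only patches the results on your side.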

John Rotenstein
  • Thank you, sir. Are there any other options in AWS where I can create my own Textract and train it? – Syed Kounain Abbas Rizvi Jun 19 '21 at 09:19
  • [Amazon SageMaker](https://docs.aws.amazon.com/sagemaker/latest/dg/whatis.html) is a machine learning service where you can train models, but it's a very complex service and probably isn't what you are seeking. – John Rotenstein Jun 19 '21 at 10:37

I know this is an old post, but I am working on a project to do exactly this. You can look at this Hugging Face model and the referenced model on GitHub: https://huggingface.co/docs/transformers/model_doc/layoutlmv2

It isn't simple, but it's the only open-source solution I know of.

G. Casey