Questions tagged [amazon-textract]

Amazon Textract enables document text detection and analysis in applications. The Amazon Textract Text Detection API can detect text in a variety of documents including financial reports, medical records, and tax forms. For documents with structured data, you can use the Amazon Textract Document Analysis API to detect linked text, tables, option buttons (radio buttons), and check boxes.

Amazon Textract documentation

226 questions
0
votes
1 answer

Create_Failed S3BatchProcessor, AWS Lambda

I am running cdk deploy in my Textract pipeline folder for large document processing. However, when I run this program I get this error: | CREATE_FAILED | AWS::Lambda::Function | S3BatchProcessor6C619AEA Resource handler…
Dr.V
  • 1
  • 1
0
votes
2 answers

QueriesConfig not being recognized by AWS

I am running this code: response = textract.start_document_analysis( DocumentLocation={ 'S3Object': { 'Bucket': bucketname, 'Name': filename } }, FeatureTypes= ['QUERIES'], QueriesConfig={'Queries':[ {'Text':'{}'.format("Who are you?")} ]},…
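A minimal sketch of how the request in this excerpt could be assembled, assuming boto3 and a document already stored in S3. The helper name `build_queries_request` is hypothetical and not part of boto3; it only builds the keyword arguments so their shape can be checked before calling the service.

```python
def build_queries_request(bucket, name, questions):
    """Build keyword arguments for textract.start_document_analysis.

    Hypothetical helper for illustration; it does not call AWS.
    """
    if not questions:
        raise ValueError("at least one query is required")
    return {
        "DocumentLocation": {"S3Object": {"Bucket": bucket, "Name": name}},
        # QueriesConfig is only used when QUERIES is among the feature types.
        "FeatureTypes": ["QUERIES"],
        "QueriesConfig": {"Queries": [{"Text": q} for q in questions]},
    }
```

With a client created as `textract = boto3.client("textract")`, the call would be `textract.start_document_analysis(**build_queries_request(bucketname, filename, ["Who are you?"]))`. If the parameter is rejected as unknown, an outdated boto3/botocore installation is one possible cause worth checking.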
0
votes
1 answer

AWS Textract with Angular TS: I am trying to connect AWS Textract with Angular through the aws-sdk/client-textract npm package, but I get Credentialerror

Here is my app.component.ts. I have imported aws-sdk/client-textract in app.component.ts and given the region of my Textract. I do not know where to give my access_key and secret_key, or which parameters need to be passed for Textract. If anyone can…
0
votes
0 answers

How to detect strikethroughs in Amazon Textract

I have some documents with deletions (strikethroughs) and I need to detect whether a document contains deletions or not. Example: A pupil has written an assignment and misspelled a word, so he put a line through the word to show the word is…
0
votes
2 answers

Return value of the Lambda can be seen by response of API gateway?

I'm trying to implement the following structure on my AWS account. I'll upload a picture to my S3 bucket with a PUT/POST request and process that picture with Textract. The API call to Textract will be handled by Lambda. So here is the…
0
votes
1 answer

Moving away from simple regex extraction to NER?

We have a relatively "simple" project from the business: digitize some scanned contracts (PDF files) with OCR and extract entities from the text. Entities can be something as simple as a specific price located in a certain subsection of the contract,…
0
votes
1 answer

Python-Textract-Boto3 - Trying to pass result of a method call as an argument to the same method, and loop

I have a multi-page PDF on AWS S3, and am using Textract to extract all text. I can get the response in batches, where the 1st response provides me with a 'NextToken' that I need to pass as an arg to the get_document_analysis method. How do I avoid…
Prolle
  • 358
  • 1
  • 10
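The NextToken pattern described in this question can be sketched as a plain loop. This is a minimal sketch, assuming the asynchronous job has already finished successfully and that `client` is a boto3 Textract client; the function name `get_all_blocks` is hypothetical.

```python
def get_all_blocks(client, job_id):
    """Collect the Blocks from every page of get_document_analysis results."""
    blocks = []
    next_token = None
    while True:
        kwargs = {"JobId": job_id}
        # NextToken is only sent from the second request onward.
        if next_token:
            kwargs["NextToken"] = next_token
        response = client.get_document_analysis(**kwargs)
        blocks.extend(response.get("Blocks", []))
        next_token = response.get("NextToken")
        # A response without NextToken is the last batch.
        if not next_token:
            return blocks
```

Keeping the token in a local variable and rebuilding the keyword arguments on each pass avoids having to nest one call inside another.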
0
votes
0 answers

How to get S3 object from AWS-Textract when called by Lambda in a private Subnet?

TL;DR I'm working on this solution, have PDF in S3 and can call Textract manually (using boto3 from my own laptop). When called from Lambda (private VPC) I get this error: An error occurred (InvalidS3ObjectException) when calling the…
0
votes
2 answers

AWS Textract form design best practices

I'm currently redesigning documents and forms to make extraction with AWS Textract easier. Do you have experience and best practices to share? Regards
0
votes
1 answer

Async Textract in AWS Lambda

How does this architecture handle a large backlog of PDFs to be processed by AWS Textract? If there's a large backlog of messages in the first queue, the first Lambda (scheduled to run every x minutes) would start picking up messages to call and…
0
votes
2 answers

Is it possible to call a lambda function at the end of Textract processing

Is it possible to call a lambda function at the end of some AWS Textract processing?
Ricardo Peres
  • 13,724
  • 5
  • 57
  • 74
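One possible approach, sketched under assumptions: the asynchronous Textract operations accept a NotificationChannel (an SNS topic ARN plus a role ARN), and a Lambda function subscribed to that topic can react when a job completes. The handler below assumes the SNS message body is JSON carrying `JobId` and `Status` fields; the return shape is made up for illustration.

```python
import json


def lambda_handler(event, context):
    """Collect the IDs of Textract jobs reported as SUCCEEDED via SNS."""
    finished = []
    for record in event.get("Records", []):
        # SNS delivers the published message as a JSON string.
        message = json.loads(record["Sns"]["Message"])
        if message.get("Status") == "SUCCEEDED":
            finished.append(message["JobId"])
    return {"succeeded_job_ids": finished}
```

Each collected job ID could then be passed to the matching Get operation (for example get_document_analysis) to fetch the results.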
0
votes
1 answer

AWS Textract - start expense analysis

While implementing the AWS Textract Analyze Expense asynchronous API using boto3 for Python, I'm getting the error 'Textract' object has no attribute 'start_expense_analysis'. On the other hand, start_document_text_detection is working fine for me in the…
codeq09
  • 1
  • 1
0
votes
0 answers

An error occurred (InvalidParameterException) when calling the DetectDocumentText operation: Request has invalid parameters

The Textract detect_document_text method from the boto3 package is erroring out on my local machine, but only on images of a certain size. It responds with a super cryptic message, so I am at a loss as to how to debug this further. The documentation doesn't indicate…
Jeremy
  • 1,717
  • 5
  • 30
  • 49
0
votes
1 answer

Convert image/jpeg file sent as image/jpeg via REST API to AWS Lambda into a usable format

CURRENTLY I am trying to send an image file to Lambda via API, such that it can be sent on to Textract. Textract requires the file to be converted to "bytes", which I successfully achieved with this code: https://stackoverflow.com/a/69868101/1730260…
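A minimal sketch of the conversion step, assuming the Lambda sits behind an API Gateway proxy integration that delivers binary payloads base64-encoded in `body` with `isBase64Encoded` set to true. The function name `image_bytes_from_event` is hypothetical, and this is not the code from the linked answer.

```python
import base64


def image_bytes_from_event(event):
    """Return the raw image bytes carried in an API Gateway proxy event."""
    body = event.get("body")
    if body is None:
        raise ValueError("request has no body")
    if not event.get("isBase64Encoded", False):
        # A plain-text body cannot carry JPEG data reliably.
        raise ValueError("expected a base64-encoded binary body")
    return base64.b64decode(body)
```

The result can be handed to Textract as `textract.detect_document_text(Document={"Bytes": image_bytes_from_event(event)})`.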
0
votes
0 answers

Encountering error "Request has invalid Client token" but my credentials are correct

I get this error when I call startDocumentTextDetection, even though I have the correct credentials, because I was able to upload my PDF to S3. I also allowed Textract to call other services on my behalf. $client = new TextractClient([ …
Albert
  • 327
  • 1
  • 3
  • 16