Amazon Textract enables document text detection and analysis in applications. The Amazon Textract Text Detection API can detect text in a variety of documents including financial reports, medical records, and tax forms. For documents with structured data, you can use the Amazon Textract Document Analysis API to detect linked text, tables, option buttons (radio buttons), and check boxes.
Questions tagged [amazon-textract]
226 questions
0
votes
1 answer
Create_Failed S3BatchProcessor, AWS Lambda
I am running cdk deploy in my Textract pipeline folder for large document processing. However, when I run this program I get this error
The error
| CREATE_FAILED | AWS::Lambda::Function | S3BatchProcessor6C619AEA
Resource handler…

Dr.V
- 1
- 1
0
votes
2 answers
QueriesConfig not being recognized by AWS
I am running this code:
response = textract.start_document_analysis(
    DocumentLocation={'S3Object': {'Bucket': bucketname, 'Name': filename}},
    FeatureTypes=['QUERIES'],
    QueriesConfig={'Queries': [{'Text': '{}'.format("Who are you?")}]},…

user19624884
- 1
- 1
0
votes
1 answer
AWS Textract with Angular TS: I am trying to connect AWS Textract with Angular through the aws-sdk/client-textract npm package, but I get a credential error
Here is my app.component.ts. I have imported aws-sdk/client-textract in app.component.ts and
given the region of my Textract. I do not know where to supply my access_key and secret_key, or which parameters need to be passed to Textract. If anyone can…
0
votes
0 answers
How to detect strikethroughs in Amazon Textract
I have some documents with deletions (strikethroughs) and I need to detect whether there are deletions in a document or not. Example: A pupil has written an assignment and mistakenly misspelled a word. So he put a line through the word to show the word is…

dontknowguy
- 63
- 5
0
votes
2 answers
Can the return value of the Lambda be seen in the response of API Gateway?
I'm trying to implement the following structure on my AWS account. I'll upload a picture to my S3 bucket with a PUT/POST request and process that picture with Textract. The API call to Textract will be handled by Lambda. So here is the…

hburak_06
- 25
- 9
0
votes
1 answer
Moving away from simple regex extraction to NER?
We have a relatively "simple" project from the business: digitize some scanned contracts (PDF files) with OCR and extract entities from the text.
Entities can be something as simple as a specific price located in a certain subsection of the contract,…

Droid
- 441
- 1
- 3
- 18
0
votes
1 answer
Python-Textract-Boto3 - Trying to pass result of a method call as an argument to the same method, and loop
I have a multi-page PDF on AWS S3, and am using Textract to extract all text. I can get the response in batches, where the 1st response provides me with a 'NextToken' that I need to pass as an arg to the get_document_analysis method.
How do I avoid…

Prolle
- 358
- 1
- 10
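A minimal sketch of the pagination loop the question describes, assuming the asynchronous job has already finished successfully. The function name `collect_analysis_blocks` is hypothetical; `client` is expected to be a boto3 Textract client (or anything exposing the same `get_document_analysis` method), and `job_id` is the value returned by the earlier start call.

```python
def collect_analysis_blocks(client, job_id):
    """Gather the Blocks from every page of a get_document_analysis result.

    `client` is assumed to behave like boto3.client("textract").
    """
    blocks = []
    next_token = None
    while True:
        # Only send NextToken once a previous response has supplied one.
        kwargs = {"JobId": job_id}
        if next_token:
            kwargs["NextToken"] = next_token

        response = client.get_document_analysis(**kwargs)
        blocks.extend(response.get("Blocks", []))

        # A response without NextToken is the last batch.
        next_token = response.get("NextToken")
        if not next_token:
            return blocks
```

Keeping the token in a loop variable avoids nesting one call inside another: each iteration feeds the previous response's token into the next request until no token comes back.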
0
votes
0 answers
How to get S3 object from AWS-Textract when called by Lambda in a private Subnet?
TL;DR
I'm working on this solution; I have a PDF in S3 and can call Textract manually (using boto3 from my own laptop).
When called from Lambda (private VPC) I get this error:
An error occurred (InvalidS3ObjectException) when calling the…

JuanMatias
- 87
- 1
- 1
- 10
0
votes
2 answers
AWS Textract form design best practices
I’m currently redesigning documents and forms to make extraction with AWS Textract easier.
Do you have any experience or best practices to share?
Regards

dontknowguy
- 63
- 5
0
votes
1 answer
Async Textract in AWS Lambda
How does this architecture handle a large backlog of PDFs to be processed by AWS Textract? If there's a large backlog of messages in the first queue, the first Lambda (scheduled to run every x minutes) would start picking up messages to call and…

sprint5
- 1
- 1
0
votes
2 answers
Is it possible to call a Lambda function at the end of Textract processing
Is it possible to call a Lambda function at the end of some AWS Textract processing?

Ricardo Peres
- 13,724
- 5
- 57
- 74
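One common pattern, sketched here under assumptions: Textract's asynchronous Start* operations accept a `NotificationChannel` (an SNS topic ARN plus a role ARN), and a Lambda function subscribed to that topic runs when the job completes. The handler below only parses the notification; the name `lambda_handler` and the returned dictionary shape are illustrative choices, not part of any AWS API.

```python
import json


def lambda_handler(event, context):
    """Read Textract job-completion notifications delivered through SNS.

    Assumes the Lambda is subscribed to the SNS topic that was passed as
    NotificationChannel when the Textract job was started.
    """
    finished = []
    for record in event.get("Records", []):
        # The SNS message body is a JSON string published by Textract.
        message = json.loads(record["Sns"]["Message"])
        finished.append(
            {"job_id": message["JobId"], "status": message["Status"]}
        )
    # A real handler would fetch the results for each succeeded job here.
    return finished
```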
0
votes
1 answer
AWS textract - start expense analysis
While implementing the AWS Textract Analyze Expense asynchronous API using boto3 for Python, I'm getting the error
'Textract' object has no attribute 'start_expense_analysis'.
On the other hand, start_document_text_detection is working fine for me in the…

codeq09
- 1
- 1
0
votes
0 answers
An error occurred (InvalidParameterException) when calling the DetectDocumentText operation: Request has invalid parameters
The Textract detect_document_text method from the boto3 package is erroring out on my local machine only on images of a certain size. It responds with a super cryptic message, so I am at a loss how to debug this further. The documentation doesn't indicate…

Jeremy
- 1,717
- 5
- 30
- 49
0
votes
1 answer
Convert image/jpeg file sent as image/jpeg via REST API to AWS Lambda into a usable format
CURRENTLY
I am trying to send an image file to Lambda via API, such that it can be sent on to Textract.
Textract requires the file to be converted to "bytes", which I successfully achieved with this code: https://stackoverflow.com/a/69868101/1730260…

Wronski
- 1,506
- 3
- 18
- 37
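A hedged sketch of one way to get usable bytes inside the Lambda, assuming an API Gateway proxy integration that delivers the binary payload base64-encoded (signalled by `isBase64Encoded` in the event). The helper names `image_bytes_from_event` and `detect_text` are hypothetical, and `client` is assumed to behave like a boto3 Textract client.

```python
import base64


def image_bytes_from_event(event):
    """Return the raw image bytes carried by an API Gateway proxy event."""
    if not event.get("isBase64Encoded"):
        # Assumption: binary uploads arrive base64-encoded; anything else
        # is treated as a configuration problem rather than guessed at.
        raise ValueError("expected a base64-encoded request body")
    return base64.b64decode(event["body"])


def detect_text(client, event):
    """Pass the decoded bytes to Textract's synchronous text detection."""
    return client.detect_document_text(
        Document={"Bytes": image_bytes_from_event(event)}
    )
```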
0
votes
0 answers
Encountering error "Request has invalid Client token" but my credentials are correct
I get this error when I call startDocumentTextDetection even though I have the correct credentials; I was able to upload my PDF to S3. I also allowed Textract to call other services on my behalf.
$client = new TextractClient([
…

Albert
- 327
- 1
- 3
- 16