Questions tagged [amazon-textract]

Amazon Textract enables document text detection and analysis in applications. The Amazon Textract Text Detection API can detect text in a variety of documents including financial reports, medical records, and tax forms. For documents with structured data, you can use the Amazon Textract Document Analysis API to detect linked text, tables, option buttons (radio buttons), and check boxes.

Amazon Textract documentation

226 questions
0
votes
1 answer

Extract business-related data from an invoice using AWS Textract

We need to extract details from documents such as invoices and delivery challans. I was trying the AWS Textract demo, where we can simply upload a PDF document and see which details it extracts as key-value pairs, tables…
Kavita
0
votes
3 answers

Not receiving a message to Amazon SNS from Textract

I am using Amazon Textract's StartDocumentAnalysis function to asynchronously scan a .pdf file from the S3 bucket. As the documentation says, I should receive a notification about the job status to the provided SNS topic. StartDocumentAnalysis…
Whizzil
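
For the asynchronous call described in this question, a minimal sketch of the request arguments may help. It assumes boto3 is the client library; the bucket, key, topic ARN, and role ARN below are hypothetical placeholders, and `build_analysis_request` is an illustrative helper, not part of any SDK. The role passed as `RoleArn` has to be one that Textract can assume and that is allowed to publish to the topic, which is worth checking when no message arrives.

```python
def build_analysis_request(bucket, key, topic_arn, role_arn,
                           features=("TABLES", "FORMS")):
    """Build keyword arguments for Textract's StartDocumentAnalysis call."""
    return {
        # The document must already be in S3 for asynchronous operations.
        "DocumentLocation": {"S3Object": {"Bucket": bucket, "Name": key}},
        "FeatureTypes": list(features),
        # Textract publishes the job status to this topic using this role.
        "NotificationChannel": {"SNSTopicArn": topic_arn, "RoleArn": role_arn},
    }


# Hypothetical values, for illustration only.
request = build_analysis_request(
    bucket="example-bucket",
    key="scans/document.pdf",
    topic_arn="arn:aws:sns:us-east-1:123456789012:AmazonTextractExampleTopic",
    role_arn="arn:aws:iam::123456789012:role/ExampleTextractRole",
)

# With credentials configured, the job could then be started like this:
# import boto3
# job = boto3.client("textract").start_document_analysis(**request)
# print(job["JobId"])
```

Keeping the request as a plain dictionary makes it easy to log and compare against the documented parameters before any call is made.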
0
votes
1 answer

Access control for AWS Managed services

Our organization is planning to use AWS managed services like Rekognition, Textract, etc. These services use S3 buckets for face comparison and document analysis. Our concern is that end users shouldn't be able to access buckets outside our…
0
votes
1 answer

Amazon Textract skips some form fields when doing analysis

I am calling the Amazon Textract API to analyze a scanned PDF image, and it skipped some fields as key-value pairs. Is there any way to train it, or to specifically point it to map key-value pairs properly?
arvind
0
votes
1 answer

How to get the Character Level Data from Amazon Textract?

I'm trying to use Amazon Textract to perform OCR to build a small application. I'm trying to find a way to get the character co-ordinates from each word. Is there any way I can find the character level coordinates/character data?
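
Textract responses describe geometry for pages, lines, and words; the smallest unit with its own bounding box is a `WORD` block. One rough workaround, sketched below, is to split a word's bounding box evenly across its characters. This is only an approximation that assumes horizontal text and roughly equal character widths; `approximate_char_boxes` is a hypothetical helper and the sample block is made up for illustration.

```python
def approximate_char_boxes(word_block):
    """Split a WORD block's bounding box into equal-width character boxes."""
    text = word_block.get("Text", "")
    if not text:
        return []
    box = word_block["Geometry"]["BoundingBox"]
    char_width = box["Width"] / len(text)
    return [
        {
            "Char": char,
            "Left": box["Left"] + index * char_width,
            "Top": box["Top"],
            "Width": char_width,
            "Height": box["Height"],
        }
        for index, char in enumerate(text)
    ]


# Made-up WORD block in the shape Textract uses (coordinates are page ratios).
sample_word = {
    "BlockType": "WORD",
    "Text": "AB",
    "Geometry": {"BoundingBox": {"Left": 0.25, "Top": 0.5, "Width": 0.5, "Height": 0.125}},
}
char_boxes = approximate_char_boxes(sample_word)
```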
0
votes
1 answer

Does AWS Textract support Hindi text in a png file?

I need to do OCR on images that contain text in Hindi, Marathi, Malayalam, and other languages. I am using the AWS Textract API in a Python script, but OCR on a scanned Hindi text document gives a response with incorrect English-like words. Does AWS…
0
votes
1 answer

Amazon Textract bounding box coordinates changing for a particular block

I am using Amazon Textract to detect raw text, forms, and tables. I am uploading a PDF for that. I am using coordinates to get the value from the raw text. I was successful in getting the value. But after some days, the bounding box…
OCR
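
Bounding box values in a Textract response are ratios of the page width and height, so looking values up by exact coordinates tends to be brittle. A minimal sketch of a more forgiving lookup is shown below: it collects the `WORD` blocks whose centre falls inside a region widened by a tolerance. The helper name, the tolerance value, and the sample blocks are assumptions made for illustration.

```python
def words_in_region(blocks, region, tolerance=0.01):
    """Return the text of WORD blocks whose centre lies inside the region."""
    left = region["Left"] - tolerance
    top = region["Top"] - tolerance
    right = region["Left"] + region["Width"] + tolerance
    bottom = region["Top"] + region["Height"] + tolerance
    found = []
    for block in blocks:
        if block.get("BlockType") != "WORD":
            continue
        box = block["Geometry"]["BoundingBox"]
        center_x = box["Left"] + box["Width"] / 2
        center_y = box["Top"] + box["Height"] / 2
        if left <= center_x <= right and top <= center_y <= bottom:
            found.append(block["Text"])
    return found


# Made-up blocks in the shape Textract uses (coordinates are page ratios).
sample_blocks = [
    {"BlockType": "WORD", "Text": "Total",
     "Geometry": {"BoundingBox": {"Left": 0.10, "Top": 0.20, "Width": 0.10, "Height": 0.02}}},
    {"BlockType": "WORD", "Text": "42.00",
     "Geometry": {"BoundingBox": {"Left": 0.52, "Top": 0.205, "Width": 0.08, "Height": 0.02}}},
]
value_region = {"Left": 0.50, "Top": 0.20, "Width": 0.20, "Height": 0.02}
matches = words_in_region(sample_blocks, value_region)
```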
0
votes
1 answer

How to highlight custom extractions using a2i's crowd-textract-analyze-document?

I would like to create a human review loop for images that have undergone OCR using Amazon Textract and entity extraction using Amazon Comprehend. My process is: send image to Textract to extract the text; send text to Comprehend to extract entities; find…
0
votes
1 answer

Using AWS Textract for processing PDF

I want to use the Textract OCR service to read text from a PDF file. I have a problem with that because I want to do it locally, without an S3 bucket. I tested it with image files and it works well, but it does not work for PDF files. This is the code…
taga
0
votes
1 answer

Unable to start a human loop using Augmented AI - Error in start_human_loop

I am trying to trigger a human workflow through a piece of python code. This is to include Human Review for Textract. The code snippet is as below: sentiment = "Neutral" blurb = "The sentiment of this document is neutral" response =…
Amita PM
0
votes
0 answers

Internal error Unable to get object metadata from S3. Check object key, region and/or access permissions in aws Textract awssdk.core

I am trying to run a document analysis request using an S3 bucket, but it is giving me an internal error. I extracted table values from a document. Here is my code. Please note I am using the AWS SDK for .NET. public async…
0
votes
1 answer

How can I use AWS Textract with Python

I have tested almost every example code I can find on the Internet for Amazon Textract and I can't get it to work. I can upload and download a file to S3 from my Python client, so the credentials should be OK. Many of the errors point to some region…
user2856066
0
votes
0 answers

Getting a KeyError in AWS Textract code. What should I do?

this is the error I am getting from the logs: [ERROR] KeyError: 'Text' Traceback (most recent call last): File "/var/task/lambda_function.py", line 51, in lambda_handler pdfText += item["Text"] + '\n' I am trying to run a form analysis via…
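
A `KeyError: 'Text'` like this one is consistent with how Textract responses are shaped: only some block types (such as `LINE` and `WORD`) carry a `Text` field, while blocks such as `PAGE` and `KEY_VALUE_SET` do not. A minimal sketch of a tolerant loop is shown below; the helper name and the sample blocks are made up for illustration.

```python
def collect_line_text(blocks):
    """Join the text of LINE blocks, skipping blocks that have no Text field."""
    lines = [
        block["Text"]
        for block in blocks
        if block.get("BlockType") == "LINE" and "Text" in block
    ]
    return "\n".join(lines)


# Made-up blocks in the shape of a form analysis response.
sample_blocks = [
    {"BlockType": "PAGE"},                      # no Text field
    {"BlockType": "LINE", "Text": "Invoice 001"},
    {"BlockType": "WORD", "Text": "Invoice"},   # skipped to avoid duplicating the line
    {"BlockType": "KEY_VALUE_SET"},             # no Text field
    {"BlockType": "LINE", "Text": "Total 42.00"},
]
pdf_text = collect_line_text(sample_blocks)
```

Filtering on `BlockType` rather than catching the exception also avoids counting each word twice, since the same text appears in both `LINE` and `WORD` blocks.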
0
votes
1 answer

AWS Python CORS header

I'm trying to use Amazon Textract, but upon my API call it says the Access-Control-Allow-Origin header is not present, and the API call does not work. I have taken steps to see that the API itself does work, but I can't use this to deploy to customers who want to use the…
ad stefnum
0
votes
1 answer

Detect and analyze text using Amazon Textract from a multi page document PDF synchronously

Answer: https://stackoverflow.com/a/62174368/8117673 A further question: will it affect the accuracy of text detection by Amazon Textract? Do I need to pre-process the image to get better results from Amazon Textract?