
I'm trying to complete a lab in the Machine Learning path in AWS Cloud Quest, but I get the error "Your Lambda function is not detecting tables correctly".

I tried a few other approaches, but none of them work. It seems I only need to uncomment a few lines of code and change a resource from FIELDS to TABLES, but it doesn't pass the test, and I'm stuck.

What am I doing wrong? Has anyone here completed this lab?

Here is my code

import json
import logging
import boto3

from trp import Document
from urllib.parse import unquote_plus

logger = logging.getLogger()
logger.setLevel(logging.INFO)

s3 = boto3.client('s3')

output_key = "output/textract_response.json"


def lambda_handler(event, context):

    logger.info(event)
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = unquote_plus(record['s3']['object']['key'])
        textract = boto3.client('textract')

        try:
            response = textract.analyze_document(  
                Document={                         
                    'S3Object': {
                        'Bucket': bucket,
                        'Name': key
                    }
                },
                FeatureTypes=['TABLES']  # FeatureTypes is a list of the types of analysis to perform.
            )

            doc = Document(response)  

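            # Form fields are only populated when 'FORMS' is in FeatureTypes.
            for page in doc.pages:
                print("Fields:")
                for field in page.form.fields:
                    print("Key: {}, Value: {}".format(field.key, field.value))

                # Search once per page rather than once per field; use a separate
                # name so the S3 object key in `key` is not overwritten.
                print("\nSearch Fields:")
                search_key = "address"
                for field in page.form.searchFieldsByKey(search_key):
                    print("Key: {}, Value: {}".format(field.key, field.value))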

            for page in doc.pages:
                print("\nTable details:")
                for table in page.tables:
                    for r, row in enumerate(table.rows):
                        for c, cell in enumerate(row.cells):
                            print("Table[{}][{}] = {}".format(r, c, cell.text))

            return_result = {"Status": "Success"}

            # Finally the response file will be written in the S3 bucket output folder.
            s3.put_object(
                Bucket=bucket,
                Key=output_key,
                Body=json.dumps(response, indent=4)
            )

            return return_result
        except Exception as error:
            return {"Status": "Failed", "Reason": json.dumps(error, default=str)}


Artur Uvarov
  • What is the expected result and what result is your code actually generating? – jarmod Apr 02 '23 at 15:52
  • The main problem is that I don't know the expected result. I'm actually getting a response with table data. Printed output: Table details: Table[0][0] = Applicant Table[0][1] = Information Table[1][0] = Full Name: Jane Table[1][1] = Doe Table[2][0] = Phone Number: Table[2][1] = 555-0100 Table[3][0] = Home Address: Table[3][1] = 123 Any Street, Any Town, USA Table[4][0] = Mailing Address: Table[4][1] = same as home address Table[0][0] = Table[0][1] = Table[0][2] = Previous Employment Table[0][3] = History Table[0][4] = Table[1][0] = Start Date ... – Artur Uvarov Apr 04 '23 at 12:29

2 Answers


I work for AWS (sales, not technical!) and have had this same issue. I raised a support ticket to check that the validation service was working correctly. Validation was failing for me too, even though the CloudWatch logs showed table data, which confirmed that the Lambda code was indeed correct. Our Support team noticed there was an issue with the assignment validation for the "Extract Text from Docs" lab, which has now been fixed. I just did the lab again and the validation worked, so the lab is now complete. You may want to try again now. Hope this helps. :)

Rick

Does the Lambda test event generate the table details? I don't see anything obviously wrong, assuming your indentation is correct.

  1. Upload form.png to the input folder again.
  2. In CloudWatch, open Log groups > /aws/lambda/labfunction and check the latest log stream saved after the new .png file was added to the input folder.
  3. The validation checks the textract_response.json file for table and cell entries, starting from about line 92 under Blocks.
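
To make step 3 concrete, here is a rough sketch of how you could check a saved response for table data yourself. It assumes the JSON has the usual Textract shape (a top-level "Blocks" list whose items carry a "BlockType"); the helper names and the trimmed sample response are made up for illustration, and what the lab's validator really checks is not known.

```python
import json
from collections import Counter


def count_block_types(response):
    """Count how many blocks of each BlockType a Textract-style response contains."""
    return Counter(block.get("BlockType") for block in response.get("Blocks", []))


def has_tables(response):
    """Return True if the response has at least one TABLE block and one CELL block."""
    counts = count_block_types(response)
    return counts["TABLE"] > 0 and counts["CELL"] > 0


# Hypothetical, heavily trimmed stand-in for output/textract_response.json.
# In practice you would download the file from the bucket and json.load() it.
sample_response = json.loads("""
{
  "Blocks": [
    {"BlockType": "PAGE"},
    {"BlockType": "TABLE"},
    {"BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 1},
    {"BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 2}
  ]
}
""")

print(count_block_types(sample_response))
print(has_tables(sample_response))
```

If the check comes back False on your real file, the function was probably not called with TABLES in FeatureTypes; if it comes back True, the Lambda side is likely doing its job.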
VeeraR
  • Table details: Table[0][0] = Applicant Table[0][1] = Information Table[1][0] = Full Name: Jane Table[1][1] = Doe Table[2][0] = Phone Number: Table[2][1] = 555-0100 Table[3][0] = Home Address: Table[3][1] = 123 Any Street, Any Town, USA Table[4][0] = Mailing Address: Table[4][1] = same as home address Table[0][0] = Table[0][1] = Table[0][2] = Previous Employment Table[0][3] = History Table[0][4] = Table[1][0] = Start Date Table[1][1] = End Date Table[1][2] = Employer Name Table[1][3] = Position Held Table[1][4] = Reason for leaving ... – Artur Uvarov Apr 04 '23 at 12:32
  • Is that output from the .json file in the S3 bucket's output/ folder? – VeeraR Apr 04 '23 at 14:13
  • No, it's from the console. I deleted the *.json file and re-uploaded the image, and the code seems to be working. To be honest, I don't understand what's going on in that file; my main problem is that I don't know the expected output. – Artur Uvarov Apr 04 '23 at 14:42