How do I use AWS Lambda to trigger Comprehend with S3?

Question

I'm currently using AWS Lambda to trigger an Amazon Comprehend job, but my code only runs sentiment analysis on a single piece of text.

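import boto3

def lambda_handler(event, context):
    s3 = boto3.client("s3")
    bucket = "bucketName"
    key = "textName.txt"
    file = s3.get_object(Bucket=bucket, Key=key)

    # read() returns bytes; decode it (assuming UTF-8 text) instead of
    # calling str(), which would produce a literal "b'...'" string
    analysisdata = file["Body"].read().decode("utf-8")

    comprehend = boto3.client("comprehend")

    # Synchronous call: analyzes the whole file as a single document
    sentiment = comprehend.detect_sentiment(Text=analysisdata, LanguageCode="en")
    print(sentiment)

    return "Sentiment detected"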

I want to process a file in which each line is a separate piece of text to run through sentiment analysis (Comprehend offers this option when you enter the input manually). Is there a way to alter this code to do that, and to have the sentiment analysis output file placed in the same S3 bucket? Thank you in advance.

score 0 · Answer 1 · answered Jul 28 '22 at 06:09

It looks like you can use start_sentiment_detection_job():

response = client.start_sentiment_detection_job(
    InputDataConfig={
        'S3Uri': 'string',
        'InputFormat': 'ONE_DOC_PER_FILE'|'ONE_DOC_PER_LINE',
        'DocumentReaderConfig': {
            'DocumentReadAction': 'TEXTRACT_DETECT_DOCUMENT_TEXT'|'TEXTRACT_ANALYZE_DOCUMENT',
            'DocumentReadMode': 'SERVICE_DEFAULT'|'FORCE_DOCUMENT_READ_ACTION',
            'FeatureTypes': [
                'TABLES'|'FORMS',
            ]
        }
    },
    OutputDataConfig={
        'S3Uri': 'string',
        'KmsKeyId': 'string'
    },
    ...
)

It can read from an object in Amazon S3 (S3Uri) and store the output in an S3 object.

It looks like you could use 'InputFormat': 'ONE_DOC_PER_LINE' to meet your requirements.
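To make this concrete, here is a minimal sketch of a Lambda handler that starts such a job, reusing the bucket and key from the question. The helper name `build_job_request`, the `COMPREHEND_ROLE_ARN` environment variable, and the `comprehend-output/` prefix are hypothetical choices for illustration. `DataAccessRoleArn` and `LanguageCode` are among the parameters hidden behind the `...` above; the role passed there is one that Amazon Comprehend assumes to read the input and write the output, so it is separate from the Lambda function's own execution role.

```python
import os


def build_job_request(bucket, key, role_arn, output_prefix="comprehend-output/"):
    """Build keyword arguments for start_sentiment_detection_job.

    output_prefix is a hypothetical location inside the same bucket.
    """
    return {
        "InputDataConfig": {
            "S3Uri": f"s3://{bucket}/{key}",
            # Treat every line of the input file as its own document
            "InputFormat": "ONE_DOC_PER_LINE",
        },
        "OutputDataConfig": {"S3Uri": f"s3://{bucket}/{output_prefix}"},
        # Role that Comprehend assumes to access the bucket
        "DataAccessRoleArn": role_arn,
        "LanguageCode": "en",
    }


def lambda_handler(event, context):
    # Imported here so build_job_request can be tried without boto3 installed
    import boto3

    comprehend = boto3.client("comprehend")
    request = build_job_request(
        bucket="bucketName",
        key="textName.txt",
        # Assumed to be configured on the function; not part of the question
        role_arn=os.environ["COMPREHEND_ROLE_ARN"],
    )
    # Asynchronous: the call returns as soon as the job has been submitted
    response = comprehend.start_sentiment_detection_job(**request)
    return {"JobId": response["JobId"], "JobStatus": response["JobStatus"]}
```

Because the job runs asynchronously, the handler only returns the job ID and initial status; the results appear under the output prefix once Comprehend finishes.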

John Rotenstein


Thank you. I am quite new to Lambda, so could I asynchronously run this response from Lambda? – bendan Jul 28 '22 at 06:48
Yes. Just make that API call and Amazon Comprehend will do all the work. It's async, so the Lambda function can exit after it has made the call. It's a bit more complex if you want to know immediately when the work is finished (you'd need to keep polling the service). – John Rotenstein Jul 28 '22 at 06:55
Thank you! I ran into this error: "Calling the invoke API action failed with this message: The role defined for the function cannot be assumed by Lambda.". I have the roles for Comprehend and Lambda set up, but I'm not sure if there is anything that I am missing. The two policies I have are ComprehendFullAccess and AWSLambdaExecute. Is there a fix to this? Thank you! – bendan Jul 28 '22 at 21:12
The error suggests that the selected IAM Role was not set up for use by the AWS Lambda service. When creating an IAM Role for use by the AWS Lambda function, make sure that "AWS Lambda" is selected as the 'trusted entity'. This will create a Trust Relationship that says `"Service": "lambda.amazonaws.com"`. – John Rotenstein Jul 28 '22 at 22:59
> if you want to know immediately... [...just trigger another Lambda off the `objectCreated` event for the output `S3Uri`] – Erik Erikson May 19 '23 at 16:59
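As a rough sketch of the last suggestion, a second Lambda function could be subscribed to object-created events on the output location instead of polling. The function below only extracts the bucket and key from the S3 event notification and logs them; the helper name `extract_s3_objects` is hypothetical, and what you do with the output object afterwards is up to you.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    results = extract_s3_objects(event)
    for bucket, key in results:
        # Placeholder: download and process the Comprehend output here
        print(f"Comprehend output ready: s3://{bucket}/{key}")
    return {"processed": len(results)}
```

If the output is written to the same bucket as the input, it is probably worth scoping the trigger to the output prefix so that uploads of new input files do not invoke this function.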
