
I have a Lambda function that is used to verify the integrity of a file, but for simplicity's sake, let's just assume that it copies a file from one bucket into another upon trigger (the Lambda gets triggered when it detects file ingestion).

The problem I am currently facing is that whenever I ingest a lot of files, the function gets triggered as many times as there are files, which leads to unnecessary invocations. One invocation can handle multiple files, so the earlier invocations typically process more than one file, and the later ones, finding no files left to process, raise a NoSuchKey error.

So I added try-except logic and had it log the NoSuchKey error at the info level, so that it does not trigger another function that watches for the ERROR keyword and alerts the prod support team.

To do that, I used this code, which is taken from the AWS documentation:

import botocore
import boto3
import logging

# Set up our logger
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger()

client = boto3.client('s3')

try:
    logger.info('Executing copy statement')
    # main logic

except botocore.exceptions.ClientError as error:
    if error.response['Error']['Code'] == 'NoSuchKey':
        logger.info('No object found - File has been moved')
    else:
        raise error

https://boto3.amazonaws.com/v1/documentation/api/latest/guide/error-handling.html
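
For reference, the elided main logic is essentially a move: copy the ingested object to the destination bucket, then delete the original. A minimal sketch, with hypothetical bucket and key names:

import boto3

client = boto3.client('s3')

# Hypothetical bucket and key names, for illustration only
source_bucket = 'ingest-bucket'
destination_bucket = 'verified-bucket'
key = 'ingested-file.csv'

# Copy the object to the destination bucket, then remove the original.
# copy_object raises ClientError if the source key no longer exists.
client.copy_object(
    Bucket=destination_bucket,
    Key=key,
    CopySource={'Bucket': source_bucket, 'Key': key},
)
client.delete_object(Bucket=source_bucket, Key=key)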

Despite that, I still get the NoSuchKey error. From CloudWatch, it seems that the logic in the "try" section gets executed, followed by an ERROR statement and then the info log 'No object found - File has been moved' that's written in my "except" section.

I thought either the "try" or the "except" block would execute, but in this case both seemed to run.

Any pros willing to shed some light?

  • *"One invocation can handle multiple files so the earlier invocations typically process more than 1 file"* - why can an early lambda handle multiple files? The lambda *should* only handle the file / files it was invoked for. – luk2302 Feb 13 '23 at 09:33
  • Apart from that the code should work, see also https://stackoverflow.com/a/42978638/2442804 . – luk2302 Feb 13 '23 at 09:35
  • @luk2302 The logic requires a pair of files (a csv and a ctl file) to be present in order to proceed (a rough sketch of this pair check is shown after these comments). So it's common that the earlier invocation fails to find the pair (since one file gets ingested first) and outputs a "missing pair" log, and then the second invocation processes both of them. There is also a case (I am guessing) where the second file gets in quicker than the first Lambda gets triggered, so the pair is caught by the first invocation, leaving the second invocation with nothing to process. – Kewei Feb 13 '23 at 09:38
  • Makes sense. What you obviously risk is that the first invocation actually starts properly processing and the second invocation comes in while the first one is still working on stuff. – luk2302 Feb 13 '23 at 09:40
  • luck2302 thats one possibility. I cant tell what actually happened under the hood but I guess that's what happened. Anyway, let me try the other code from the other post that you shared. Hopefully the "except s3.meta.client.exceptions.NoSuchKey:" solves the issue. – Kewei Feb 13 '23 at 09:43
  • No luck with "except s3.meta.client.exceptions.NoSuchKey:" – Kewei Feb 13 '23 at 09:53
  • Your post is missing some critical code. What AWS operation(s) are you executing in the `try` arm? – jarmod Feb 13 '23 at 13:17
  • @jarmod It copies the object from one path to another and deletes the old file, basically moving a file from one place to another. Based on the CloudWatch logs, the error originated here, thus I didn't include the whole logic. – Kewei Feb 14 '23 at 02:21
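
As the comments describe, the handler first checks that both files of the pair are present before moving anything. A rough sketch of that pair check, assuming a hypothetical naming convention where data.csv is paired with data.ctl (head_object is used here purely as an existence check):

import boto3
import botocore

client = boto3.client('s3')

def pair_present(bucket, csv_key):
    # Hypothetical naming convention: 'data.csv' pairs with 'data.ctl'
    ctl_key = csv_key.rsplit('.', 1)[0] + '.ctl'
    for key in (csv_key, ctl_key):
        try:
            client.head_object(Bucket=bucket, Key=key)
        except botocore.exceptions.ClientError as error:
            # head_object reports a missing key with error code '404', not 'NoSuchKey'
            if error.response['Error']['Code'] == '404':
                return False  # partner file not ingested yet
            raise
    return True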

1 Answer


Found the solution to this:

import botocore
import boto3
import logging

# Set up our logger
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger()

client = boto3.client('s3')

try:
    logger.info('Executing copy statement')
    # main logic

except botocore.exceptions.ClientError as error:
    if error.response['ResponseMetadata']['HTTPStatusCode'] == 404:
        logger.info('No object found - File has been moved')
    else:
        raise error

So the reason the NoSuchKey error kept popping up despite the earlier handling is that the same underlying error can be surfaced in two different forms, depending on the operation.

  • The "NoSuchKey" error is a specific error message that is returned by the S3 service when it cannot find the specified object. This error message is returned directly by the S3 service, and is not wrapped in an AWS SDK-specific error message.

  • The "ClientError: An error occurred (404) when calling the HeadObject operation: Not Found" error is a more generic error message that is returned by the AWS SDK for Python (Boto3) when it receives a 404 response from the S3 service. This error message is a wrapper around the actual error message returned by the S3 service.

My earlier code handled only one of them. By checking error.response['ResponseMetadata']['HTTPStatusCode'] == 404 instead, I am able to handle both.
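
A minimal sketch that makes the difference visible (the bucket and key names here are placeholders): head_object reports a missing key with error code '404', while get_object or copy_object report it as 'NoSuchKey'. Checking the error code against both values, or checking the HTTP status code, covers either case.

import boto3
import botocore
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger()
client = boto3.client('s3')

try:
    # For a missing key, head_object raises ClientError with code '404',
    # whereas get_object/copy_object raise it with code 'NoSuchKey'.
    client.head_object(Bucket='source-bucket', Key='already-moved.csv')  # placeholder names
except botocore.exceptions.ClientError as error:
    code = error.response['Error']['Code']
    status = error.response['ResponseMetadata']['HTTPStatusCode']
    if code in ('NoSuchKey', '404') or status == 404:
        logger.info('No object found - File has been moved')
    else:
        raise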
