-1

EDIT: Added test method.
EDIT: Added sample code and more detailed explination of the problem.

How do I prevent a Python 3.8 AWS Lambda from logging an unhandled exception?

We have a Python 3.8 AWS Lambda function that is only invoked directly, not via API Gateway or any other AWS service. We have custom logging that formats it correctly for a third party service to consume.

Simplified sample code

custom_logging.py

import json
import logging
from contextlib import contextmanager
from typing import Generator, List

metadata_stack: List[dict] = [{}]

@contextmanager
def add_metadata(metadata: dict) -> Generator[None, None, None]:
    metadata_stack.append({ **metadata_stack[-1], **metadata})
    yield
    metadata_stack.pop()

class CustomFormatter(logging.Formatter):
    def formatMessage(self, record: logging.LogRecord) -> str:
        record.message = json.dumps(
            {
                'status': record.levelname,
                'message': record.message,
                'metadata': metadata_stack[-1],
            },
            separators=(',',':'),
            default=str,
        )
        return super().formatMessage(record)

def _init_root_logger():
    logger = logging.root
    logger.setLevel(logging.DEBUG)
    if logger.hasHandlers():
        handler = logger.handlers[0]
        class_name = handler.__class__.__name__
        if (class_name == 'LambdaLoggerHandler'):
            handler.setFormatter(CustomFormatter())
            logger.info('Identified AWS Lambda logger')
        else:
            logger.info('Unidentified logger handler: ' + class_name)
    else:
        logger.info('Identified no logger. Assuming logging to console.')        

_init_root_logger()

lambda-handler.py

import logging
import custom_logging

class LambdaHandler:
    def __init__(self, logger):
        self.logger = logger

    def handle(self, event, context):
        with custom_logging.add_metadata({ 'foo': 'bar' }):
            try:
                self.logger.info('Lambda initialised.')
                self.main()
            except:
                self.logger.exception('An error occurred.')
                raise
    
    def main(self):
        raise ValueError('I have no values.')

def handle(event, context):
    logger = logging.getLogger('LambdaHandler')
    return LambdaHandler(logger).handle(event, context)

Execution

When this lambda has an exception, we want it to fail. For testing, I'm executing this from the CLI with the command:

aws lambda invoke --function-name mylambda response.json &&
cat response.json | jq .

The invocation response (output of first command) is:

{
    "StatusCode": 200,
    "FunctionError": "Unhandled",
    "ExecutedVersion": "$LATEST"
}

And the returned payload (output of second command) is:

{
  "errorMessage": "I have no values.",
  "errorType": "ValueError",
  "requestId": "526834fc-4bf8-4b39-b695-6c0525aecead",
  "stackTrace": [
    "  File \"/var/task/lambda_handler.py\", line 22, in handle\n    return LambdaHandler(logger).handle(event, context)\n",
    "  File \"/var/task/lambda_handler.py\", line 12, in handle\n    self.main()\n",
    "  File \"/var/task/lambda_handler.py\", line 18, in main\n    raise ValueError('I have no values.')\n"
  ]
}

This is great, except when Python crashes, AWS logs this exception using the default AWS logger (which I suspect is stderr), not our custom one. This disrupts integration with our third party log aggregator.

Logs

This produces the following logs.

INIT_START Runtime Version: python:3.9.v23  Runtime Version ARN: arn:aws:lambda:ap-southeast-2::runtime:bf3c2cf12ad277455c3732549ac6b43321b08f1eb445e825bd941721e0415621
{
    "status": "INFO",
    "message": "Identified AWS Lambda logger",
    "metadata": {}
}
START RequestId: 526834fc-4bf8-4b39-b695-6c0525aecead Version: $LATEST
{
    "status": "INFO",
    "message": "Lambda initialised.",
    "metadata": {
        "foo": "bar"
    }
}

This next one needs some work to include the exception in the JSON, but that's for another question. It's not breaking our integration.

{
    "status": "ERROR",
    "message": "An error occurred.",
    "metadata": {
        "foo": "bar"
    }
}

Traceback (most recent call last):
  File "/var/task/lambda_handler.py", line 12, in handle
    self.main()
  File "/var/task/lambda_handler.py", line 18, in main
    raise ValueError('I have no values.')
ValueError: I have no values.

This one we want gone. It's being generated by AWS's bootstrap.py.

[ERROR] ValueError: I have no values.
Traceback (most recent call last):
  File "/var/task/lambda_handler.py", line 22, in handle
    return LambdaHandler(logger).handle(event, context)
  File "/var/task/lambda_handler.py", line 12, in handle
    self.main()
  File "/var/task/lambda_handler.py", line 18, in main
    raise ValueError('I have no values.')
END RequestId: 526834fc-4bf8-4b39-b695-6c0525aecead

Attempted Remedy

We can wrap the whole lambda function in a try/except and log it explicitly, but this removes the "FunctionError": "Unhandled" in the client response. Return a payload with "statusCode": 500 has no effect. It just makes that the returned payload.

Response:

{
  "StatusCode": 200,
  "ExecutedVersion": "$LATEST"
}

Payload:

{
  "statusCode": 500
}

This breaks out-of-the-box behaviours with AWS SDK and Step Function state machines.

How can we keep the "FunctionError" attribute without triggering an automatic error log?

Hand-E-Food
  • 12,368
  • 8
  • 45
  • 80
  • After reading the AWS Python bootstrap source code, this is looking unfeasible. It has custom string concatenation that would require overloading parts of the AWS `bootstrap.py` module, which screams fragile to me. I'm resorting to returning a `"successful": bool` in the response, and handling errors manually where it's called. – Hand-E-Food Jul 08 '23 at 03:58
  • Your terminology is making the question a bit difficult to read. What you are calling a response I would call the invocation status. What you are calling a payload is actually the invocation response. AWS Lambda has a specific thing called a "payload", and that is the input you send to the lambda function, not what the function sends you back. – Mark B Jul 09 '23 at 14:08
  • It is tough terminology given there's essentially two requests and two responses involved. I'm trying to use the terminology from the [AWS documentation](https://docs.aws.amazon.com/lambda/latest/dg/API_Invoke.html#API_Invoke_ResponseElements). – Hand-E-Food Jul 10 '23 at 00:19

1 Answers1

0

but we ideally want the "FunctionError": "Unhandled" in the client response

Then handle the error, and put that message in the response:

def lambda_handler(event, context):
  try:
    # Your current handler code goes here
  except:
    return {
        "statusCode": 500,
        "headers": {
            "Content-Type": "application/json"
        },
        "body": json.dumps({
            "FunctionError": "Unhandled"
        })
    }
Mark B
  • 183,023
  • 24
  • 297
  • 295
  • This adds `"FunctionError"` to the lambda output instead of the lambda response. – Hand-E-Food Jul 06 '23 at 23:23
  • Please explain the difference between the "lambda output" and the "lambda response". – Mark B Jul 07 '23 at 12:14
  • The response should be controlled by what the Lambda function returns. They should basically be the same thing. If they are not, then you need to describe in much more detail what "We have custom logging that formats it correctly for a third party service to consume." is doing exactly, and maybe show the code for your function, or at least how it is currently returning something in the format you need. – Mark B Jul 07 '23 at 20:01
  • When a lambda is invoked, there are two layers. There is the request from the client to AWS to invoke the lambda. Part of this request is an input payload. AWS extracts the payload from the invoke request and passes it as a parameter to the lambda handler. The lambda handler returns a result. This is wrapped up as a payload in the invoke response alongside metadata describing the invocation, which AWS returns to the client. It's that response metadata I want to be accurate while avoiding the logging. – Hand-E-Food Jul 08 '23 at 03:52
  • The key beenfit of this is the AWS SDK and other services react automatically to the invoke response. The SDK can automatically retry the lambda invocation. Step Machine state functions can catch exceptions and manage it through a different branch. I'd love to use these tools that AWS provides rather than write bespoke solutions. – Hand-E-Food Jul 08 '23 at 04:02
  • You don't have to explain to us how AWS Lambda works. What you do need to do is show the exact code you are using and the exact response you are getting, if you want us to be able to point out what needs to be changed. – Mark B Jul 08 '23 at 12:40
  • Sorry, I was trying to answer the question of the difference between the output and the response. I've thoroughly edited my question above and added sample code. – Hand-E-Food Jul 08 '23 at 23:36
  • I think the problem in your current code is you are capturing the exception and logging it, but then raising the exception again, which is passing it back to the AWS Lambda bootstrap to handle. You need to catch the exception and handle it the way you actually want it handled, and not pass it up the stack further. – Mark B Jul 09 '23 at 14:05
  • I think it is ultimately a design problem with how we're using the lambda. We want an unhandled exception, but to be logged using our log formatter. I understand now that AWS won't let us override the top-level logging format, because if the fault was in our log formatter, then we'd never see any logs. We'll switch to returning a payload with a success/failure flag and handle it manually. Thanks for your time in trying to understand our odd problem. – Hand-E-Food Jul 10 '23 at 00:22