
I have a question about invoking an AWS Lambda function (Seoul region) from Airflow (Amazon Managed Workflows for Apache Airflow, Tokyo region).

The problem is that when I invoke the Lambda function from Airflow, the Airflow UI reports that the task failed.

The strange thing is that the AWS logs show the Lambda function in question ran fine and returned status code 200.

I have heard that Lambda invocations from Airflow time out after 5 minutes.

In fact, my other Lambda functions that finish in under 5 minutes succeed in Airflow.

My questions are:

  1. Is the 5-minute limit real?
  2. If so, can I configure it somewhere?

Any help would be appreciated.

Below is part of my code.

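# Imports assumed for Airflow 2.x (not shown in the original excerpt)
from datetime import timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.hooks.lambda_function import AwsLambdaHook
from airflow.utils.dates import days_ago
from botocore.config import Config

config = Config(
    read_timeout=900,
    connect_timeout=900,
    retries={"max_attempts": 1},
)

# Invoke the Lambda function synchronously; return True on status code 200
def _invoke_lambda_func(lambda_name, payload):
    lambda_hook = AwsLambdaHook(function_name=lambda_name, config=config)
    response = lambda_hook.invoke_lambda(payload=payload)
    if response['StatusCode'] == 200:
        return True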

with DAG(
    dag_id='lambda_test',
    default_args=args,
    dagrun_timeout=timedelta(hours=1),
    start_date=days_ago(2),
    #schedule_interval='0 3 * * *',
    schedule_interval=None,
    tags=['lambda_test'],
) as dag:
    # Start task
    for lambda_name in lambda_func_list:
        lambda_task = PythonOperator(
            task_id=f'{lambda_name}_lambda_func',
            python_callable=_invoke_lambda_func,
            op_kwargs={
                'lambda_name':lambda_name,
                'payload':'null'
            }
        )
        lambda_func[lambda_name] = lambda_task
    test = PythonOperator(
        task_id='final_func',
        python_callable=_op_complete
    )
    lambda_func['extract_message_target_users'] >> test
shimo
  • [This](https://forums.aws.amazon.com/thread.jspa?messageID=965540) thread is unsolved, but is it similar? – shimo Aug 30 '21 at 10:42
  • Oh, yeah, exactly the same. I also tried VPC peering between the two regions, but it still doesn't work. – hyeonkimmm Aug 31 '21 at 02:06
  • Have you tried invoking your Lambda directly with boto3? You could do an asynchronous invocation, see https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/lambda.html#Lambda.Client.invoke – dovregubben Oct 22 '21 at 14:02
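
The asynchronous invocation suggested in the last comment could look roughly like the sketch below. It is only an illustration under stated assumptions: `invoke_async` is a hypothetical helper name, and `client` is assumed to be a boto3 Lambda client created elsewhere (for example `boto3.client("lambda", region_name="ap-northeast-2")`). With `InvocationType="Event"`, Lambda queues the event and acknowledges it with status code 202 instead of holding the connection open until the function finishes.

```python
import json
from typing import Any, Optional


def invoke_async(client: Any, function_name: str, payload: Optional[dict] = None) -> bool:
    """Queue a Lambda invocation without waiting for the function to finish.

    `client` is assumed to be a boto3 Lambda client; `invoke_async` itself
    is a hypothetical helper, not part of boto3 or Airflow.
    """
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="Event",  # asynchronous: the call returns once the event is queued
        Payload=json.dumps(payload).encode("utf-8"),  # None is serialized as "null"
    )
    # An accepted asynchronous invocation is acknowledged with HTTP 202, not 200.
    if response["StatusCode"] != 202:
        raise RuntimeError(f"Lambda did not accept the event: {response['StatusCode']}")
    return True
```

Note the trade-off: the Airflow task then only knows that the event was accepted, not whether the function itself succeeded, so the result has to be checked some other way.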

1 Answer


You need to override the hook's Boto3 client so that you can pass it a botocore Config with a larger read_timeout. In principle this should also be possible by editing the connection in the Airflow UI, but that does not seem to work. Here is the code:

from __future__ import annotations

# functools.cached_property requires Python 3.8+
from functools import cached_property

from botocore.config import Config

from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
from airflow.providers.amazon.aws.operators.lambda_function import (
    LambdaInvokeFunctionOperator as BaseLambdaInvokeFunctionOperator,
)
from airflow.providers.amazon.aws.utils import trim_none_values

class LambdaInvokeFunctionOperator(BaseLambdaInvokeFunctionOperator):
    @cached_property
    def hook(self) -> LambdaHook:
        return LambdaHook(aws_conn_id=self.aws_conn_id)

class LambdaHook(AwsBaseHook):
    def __init__(self, *args, **kwargs) -> None:
        kwargs["client_type"] = "lambda"
        super().__init__(*args, **kwargs)

    @cached_property
    def conn(self):
        # Increases the read_timeout to 900 seconds
        config_dict = {"connect_timeout": 5, "read_timeout": 900, "tcp_keepalive": True}
        config = Config(**config_dict)
        return self.get_client_type(region_name=self.region_name, config=config)

    def invoke_lambda(
        self,
        *,
        function_name: str,
        invocation_type: str | None = None,
        log_type: str | None = None,
        client_context: str | None = None,
        payload: str | None = None,
        qualifier: str | None = None,
    ):
        invoke_args = {
            "FunctionName": function_name,
            "InvocationType": invocation_type,
            "LogType": log_type,
            "ClientContext": client_context,
            "Payload": payload,
            "Qualifier": qualifier,
        }
        return self.conn.invoke(**trim_none_values(invoke_args))
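
Once the call returns, the response still has to be checked. For a synchronous invocation, Lambda returns status code 200 even when the function itself raised an error; in that case the response carries a `FunctionError` key. A minimal sketch of such a check follows, where `ensure_lambda_succeeded` is a hypothetical helper name and not part of the Airflow provider:

```python
from typing import Any, Mapping


def ensure_lambda_succeeded(response: Mapping[str, Any]) -> None:
    """Raise if a synchronous Lambda invocation did not succeed.

    `response` is the dict returned by the Lambda `invoke` call.
    Hypothetical helper for illustration only.
    """
    status = response.get("StatusCode")
    if status != 200:
        raise RuntimeError(f"Lambda invocation failed with status code {status}")

    # The service sets FunctionError when the function code itself failed,
    # even though the HTTP status of the invocation is still 200.
    if "FunctionError" in response:
        payload = response.get("Payload")
        # With boto3 the payload is a stream; read it if possible.
        detail = payload.read().decode("utf-8") if hasattr(payload, "read") else payload
        raise RuntimeError(f"Lambda function error ({response['FunctionError']}): {detail}")
```

Calling this inside the Python callable makes the Airflow task fail whenever the function did not actually succeed, instead of finishing silently.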
TrendSpark