
I have a problem when I try to execute multiple tasks within MWAA using POST requests. I am on the mw1.small tier of MWAA and I schedule around 3 tasks per minute with EventBridge and Lambda. When I check my results, I find that some runs are missing: the logs show the DAG was triggered, but the run was never scheduled or queued, and it does not appear in the tree or graph view.

I have 169 rules in EventBridge, each running at a set time every day, yet I only see around 165 to 166 executions of the DAG. The problem is not in EventBridge or Lambda: I checked the logs for both services and all 169 DAG invocations complete successfully.

The Lambda function mentioned above triggers the DAG with a POST request, one for every rule I have in EventBridge.
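For context, the trigger logic looks roughly like this (a minimal sketch assuming the standard MWAA CLI-token pattern; the function names and `env_name` parameter are illustrative, not my exact code):

```python
import json
import urllib.request


def build_cli_command(dag_id: str, conf: dict) -> str:
    """Build the raw Airflow CLI string that MWAA's /aws_mwaa/cli endpoint expects."""
    return f"dags trigger {dag_id} --conf '{json.dumps(conf)}'"


def trigger_dag(env_name: str, dag_id: str, conf: dict) -> None:
    """POST the trigger command to the MWAA web server (one call per EventBridge rule)."""
    import boto3  # imported here so the pure helper above stays usable outside AWS

    token = boto3.client("mwaa").create_cli_token(Name=env_name)
    req = urllib.request.Request(
        url=f"https://{token['WebServerHostname']}/aws_mwaa/cli",
        data=build_cli_command(dag_id, conf).encode(),
        headers={
            "Authorization": f"Bearer {token['CliToken']}",
            "Content-Type": "text/plain",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()
```

This is consistent with the tracebacks below, which go through `airflow dags trigger` (`cli_parser.py` → `dag_command.py`) rather than the stable REST API.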

These are the configuration options I have set:

celery.pool=1
celery.worker_autoscale=1,1
core.dag_file_processor_timeout=150
core.dagbag_import_timeout=90
core.killed_task_cleanup_time=604800
core.min_serialized_dag_update_interval=60
scheduler.dag_dir_list_interval=300
scheduler.min_file_process_interval=300
scheduler.parsing_processes=1
scheduler.processor_poll_interval=60
scheduler.schedule_after_task_execution=false

NOTE: I know I can use Step Functions but this is not an option in my case.

EDIT: This problem is caused by multiple parallel requests made from the Lambda function. Airflow 2.2.2 enforces a unique constraint on (dag_id, execution_date) in the dag_run table, so concurrent triggers that resolve to the same execution date collide on insert.
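Note that the CLI trigger path truncates microseconds from the execution date, so two requests landing within the same second produce the same key. One mitigation I am considering is retrying from the Lambda with randomized backoff, so a retried request falls into a different second (a sketch with an illustrative helper, not my production code):

```python
import random
import time


def with_jitter_retry(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying with randomized backoff so concurrent Lambda
    invocations that collided on the same execution_date second do not
    re-collide on retry."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            # back off past the current second, plus random jitter so
            # simultaneous retries land on different timestamps
            time.sleep(base_delay * (attempt + 1) + random.uniform(0, base_delay))
```

Spreading the EventBridge schedules a few seconds apart would achieve the same thing without retries, but with 169 rules that is tedious to maintain.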

The two types of traceback that I found are:

/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/common/experimental/trigger_dag.py:91 DeprecationWarning: Calling `DAG.create_dagrun()` without an explicit data interval is deprecated
Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
    cursor, statement, parameters, context
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "dag_run_dag_id_execution_date_key"
DETAIL:  Key (dag_id, execution_date)=(test_dag, 2023-02-16 20:19:55+00) already exists.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/airflow/.local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/__main__.py", line 48, in main
    args.func(args)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/cli/cli_parser.py", line 48, in command
    return func(*args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/utils/cli.py", line 92, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/cli/commands/dag_command.py", line 138, in dag_trigger
    dag_id=args.dag_id, run_id=args.run_id, conf=args.conf, execution_date=args.exec_date
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/client/local_client.py", line 30, in trigger_dag
    dag_id=dag_id, run_id=run_id, conf=conf, execution_date=execution_date
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/common/experimental/trigger_dag.py", line 125, in trigger_dag
    replace_microseconds=replace_microseconds,
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/common/experimental/trigger_dag.py", line 91, in _trigger_dag
    dag_hash=dag_bag.dags_hash.get(dag_id),
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/utils/session.py", line 70, in wrapper
    return func(*args, session=session, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/models/dag.py", line 2359, in create_dagrun
    session.flush()
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2540, in flush
    self._flush(objects)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2682, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    with_traceback=exc_tb,
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2642, in _flush
    flush_context.execute()
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
    rec.execute(self)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
    uow,
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
    insert,
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 1136, in _emit_insert_statements
    statement, params
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
    distilled_params,
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
    e, statement, parameters, cursor, context
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
    sqlalchemy_exception, with_traceback=exc_info[2], from_=e
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
    cursor, statement, parameters, context
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "dag_run_dag_id_execution_date_key"
DETAIL:  Key (dag_id, execution_date)=(test_dag, 2023-02-16 20:19:55+00) already exists.

[SQL: INSERT INTO dag_run (dag_id, queued_at, execution_date, start_date, end_date, state, run_id, creating_job_id, external_trigger, run_type, conf, data_interval_start, data_interval_end, last_scheduling_decision, dag_hash) VALUES (%(dag_id)s, %(queued_at)s, %(execution_date)s, %(start_date)s, %(end_date)s, %(state)s, %(run_id)s, %(creating_job_id)s, %(external_trigger)s, %(run_type)s, %(conf)s, %(data_interval_start)s, %(data_interval_end)s, %(last_scheduling_decision)s, %(dag_hash)s) RETURNING dag_run.id]
[parameters: {'dag_id': 'test_dag', 'queued_at': datetime.datetime(2023, 2, 16, 20, 19, 56, 168249, tzinfo=Timezone('UTC')), 'execution_date': DateTime(2023, 2, 16, 20, 19, 55, tzinfo=Timezone('UTC')), 'start_date': None, 'end_date': None, 'state': <TaskInstanceState.QUEUED: 'queued'>, 'run_id': 'test22__2023-02-16T20:19:03+602430', 'creating_job_id': None, 'external_trigger': True, 'run_type': <DagRunType.MANUAL: 'manual'>, 'conf': <psycopg2.extensions.Binary object at 0x7fe5917cc900>, 'data_interval_start': DateTime(2023, 2, 16, 20, 19, 55, tzinfo=Timezone('UTC')), 'data_interval_end': DateTime(2023, 2, 16, 20, 19, 55, tzinfo=Timezone('UTC')), 'last_scheduling_decision': None, 'dag_hash': 'a1c4fce80be1afad038a0ccd8a41efcf'}]
(Background on this error at: http://sqlalche.me/e/13/gkpj)

and

Traceback (most recent call last):
  File "/usr/local/airflow/.local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/__main__.py", line 48, in main
    args.func(args)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/cli/cli_parser.py", line 48, in command
    return func(*args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/utils/cli.py", line 92, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/cli/commands/dag_command.py", line 138, in dag_trigger
    dag_id=args.dag_id, run_id=args.run_id, conf=args.conf, execution_date=args.exec_date
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/client/local_client.py", line 30, in trigger_dag
    dag_id=dag_id, run_id=run_id, conf=conf, execution_date=execution_date
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/common/experimental/trigger_dag.py", line 125, in trigger_dag
    replace_microseconds=replace_microseconds,
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/common/experimental/trigger_dag.py", line 75, in _trigger_dag
    f"A Dag Run already exists for dag id {dag_id} at {execution_date} with run id {run_id}"
airflow.exceptions.DagRunAlreadyExists: A Dag Run already exists for dag id test_dag at 2023-02-16 20:20:23+00:00 with run id test21__2023-02-16T20:20:04+061773