3

I am writing a unit test for my ETLs and as a process, I want to test all Dags to make sure that they do not have cycles. After reading Data Pipelines with Apache Airflow by Bas Harenslak and Julian de Ruiter I see they are using DAG.test_cycle(), the DAG here is imported from the module airflow.models.dag but when I run the code I get an error that AttributeError: 'DAG' object has no attribute 'test_cycle'

Here is my code snippet

import glob
import importlib
import os

import pytest
from airflow.models.dag import DAG

DAG_PATH = os.path.join(os.path.dirname(file), “…”, “…”, “dags/**/*.py”)
DAG_FILES = glob.glob(DAG_PATH, recursive=True)

@pytest.mark.parametrize("dag_file", DAG_FILES)
def test_dag_integrity(dag_file):
    module_name, _ = os.path.splitext(dag_file)
    module_path = os.path.join(DAG_PATH, dag_file)
    mod_spec = importlib.util.spec_from_file_location(module_name, module_path)
    module = importlib.util.module_from_spec(mod_spec)
    mod_spec.loader.exec_module(module)

    dag_objects = [var for var in vars(module).values() if isinstance(var, DAG)]

    assert dag_objects

    for dag in dag_objects:
        dag.test_cycle()

3 Answers3

7

In Airflow 2.0.0 or greater, you could use test_cycle() function that takes a dag as argument:

def test_cycle(dag):
    """
    Check to see if there are any cycles in the DAG. Returns False if no cycle found,
    otherwise raises exception.
    """

Source

Import like this:

from airflow.utils.dag_cycle_tester import test_cycle

You could find an example in the definition of DagBag class.

NicoE
  • 4,373
  • 3
  • 18
  • 33
  • 3
    For some reason i get the below error when running this. Any ideas on how to resolve? def test_cycle(dag): E fixture 'dag' not found > available fixtures: anyio_backend, anyio_backend_name, anyio_backend_options, cache, capfd, capfdbinary, caplog, capsys, capsysbinary, celery_app, celery_class_tasks, c – adan11 Jul 25 '21 at 08:45
  • 2
    @adan11 Your issue is that when you put an argument in any function with a name starting with "test" in a pytest suite, pytest thinks it's a test, and assumes the arguments being passed are fixtures. If you have a newer version of Airflow, this function has been renamed to "check_cycle". If you don't, simply alias the function on import as Makr did in their answer on this page. – JPKab Jun 13 '22 at 18:34
5

In case someone faced the same issue as @adan11 have mentioned when using the test_cycle method, the problem with that is when we compile our test suite, our test tool ie. pytest, will parse the test_cycle method upon importing it.

e.g

# Pytest will collect this function
from airflow.utils.dag_cycle_tester import test_cycle

Pytest will ignore it when we use an alias name like below:

from airflow.utils.dag_cycle_tester import test_cycle as _test_cycle

This will work depending on your test configuration.

Makr
  • 51
  • 1
  • 1
1

Update on the above answers:

Due to the conflicts caused by having a function with "test" in the name, Airflow team renamed the function to check_cycle

from airflow.utils.dag_cycle_tester import check_cycle

To run the actual test, change the final two line to read check_cycle(dag)

flexponsive
  • 6,060
  • 8
  • 26
  • 41
JPKab
  • 221
  • 3
  • 10