I'm trying to create and run a Great Expectations checkpoint. For this I wrote the following Python script:

import sys
from datetime import datetime

from great_expectations.data_context import DataContext
from great_expectations.validation_operators.types.validation_operator_result import (
    ValidationOperatorResult,
)
import great_expectations as gx
from airflow import AirflowException


def execute_new_checkpoint(
    ge_root_dir: str,
    datasource_name: str,
    data_connector_name: str,
    data_asset_name: str,
    checkpoint_name: str,
) -> None:
    """ Execute a new checkpoint """
    context: DataContext = gx.data_context.DataContext(ge_root_dir)

    context.add_checkpoint(
        name=checkpoint_name,
        batch_request={
            "datasource_name": datasource_name,
            "data_connector_name": data_connector_name,
            "data_asset_name": data_asset_name,
        },
    )
    results: ValidationOperatorResult = context.run_checkpoint(
        checkpoint_name=checkpoint_name,
        run_id=f"airflow: {checkpoint_name}:{datetime.now():%Y%m%dT%H%M%S}",
    )

    if not results["success"]:
        raise AirflowException("Validation of the data is not successful ")


if __name__ == "__main__":
    ...
    execute_new_checkpoint(*sys.argv[1:])

When I run it, I get a warning and an error:

$ python ge_run_checkpoint.py /path/to/great_expectations/ my_datasource my_data_connector my_data_asset my_checkpoint

...

{util.py:56} WARNING - Instantiating class from config without an explicit class_name is dangerous. Consider adding an explicit class_name for tests1

...

KeyError: "Neither config : {'name': 'tests1', 'batch_request': {'datasource_name': 'my_datasource', 'data_connector_name': 'my_data_connector', 'data_asset_name': 'my_data_asset'}} nor config_defaults : {} contains a class_name key."

I've tried adding a class_name (Asset and ConfiguredAssetSqlDataConnector), but I also get an error:

- Please verify that the class named `ConfiguredAssetSqlDataConnector` exists.

I want to create and run a Great Expectations checkpoint to validate the expectations.

1 Answer

I got it working by changing the script as follows:

import re  # needed for re.sub in run_name_template; other imports as in the question


def execute_new_checkpoint(
    ge_root_dir: str,
    datasource_name: str,
    data_connector_name: str,
    data_asset_name: str,
    checkpoint_name: str,
    expectation_name: str = "default",
) -> None:
    context: DataContext = gx.data_context.DataContext(ge_root_dir)

    context.add_checkpoint(
        name=checkpoint_name,
        config_version=1.0,
        class_name="SimpleCheckpoint",
        run_name_template=f"%Y%m%d-%H%M%S-{re.sub('[_ ]', '-', checkpoint_name.lower())}",
        validations=[
            {
                "batch_request": {
                    "datasource_name": datasource_name,
                    "data_connector_name": data_connector_name,
                    "data_asset_name": data_asset_name,
                    "data_connector_query": {"index": -1},
                },
                "expectation_suite_name": expectation_name,
            }
        ],
    )
    ...

It was necessary to change the checkpoint configuration (adding class_name and config_version) and to add the expectation suite name.

This is based on the Jupyter notebooks generated by the command:

$ great_expectations --v3-api suite new
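
For reference, here is a small standalone sketch of what the run_name_template expression in the script produces. It only uses the standard library. The helper name build_run_name_template and the sample checkpoint names are made up for illustration, and I am assuming that Great Expectations expands the template with strftime-style formatting at run time, so the fixed timestamp below just stands in for the run time.

```python
import re
from datetime import datetime


def build_run_name_template(checkpoint_name: str) -> str:
    """Return a strftime-style template ending in a slug of the checkpoint name.

    Hypothetical helper: it mirrors the f-string passed as run_name_template
    in the script, it is not part of Great Expectations.
    """
    # Lowercase the name and replace underscores and spaces with hyphens.
    slug = re.sub("[_ ]", "-", checkpoint_name.lower())
    return f"%Y%m%d-%H%M%S-{slug}"


if __name__ == "__main__":
    template = build_run_name_template("My_Checkpoint 1")
    print(template)

    # Assumption: the library fills in the date parts at run time.
    # A fixed timestamp is used here only to show the expansion.
    print(datetime(2023, 1, 2, 3, 4, 5).strftime(template))
```

Note that the date placeholders are left untouched by the f-string and are only filled in when the template is formatted with a timestamp.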