How to run Great Expectations on AWS Lambda

Question

I am trying to use Great Expectations to run expectation suites inside an AWS Lambda function.

When I try to install the packages in requirements.txt, I get an error related to JupyterLab:

    aws-sam\\build\\ValidationFunction\\.\\jupyterlab_widgets-1.1.0.data\\data\\share\\jupyter\\labextensions\\@jupyter-widgets\\jupyterlab-manager\\schemas\\@jupyter-widgets\\jupyterlab-manager\\package.json.orig'

I am using SAM CLI version 1.42.0 and building the function inside a container, with Python 3.9.

Requirements.txt:

    botocore
    boto3
    awslambdaric
    awswrangler
    pandas_profiling
    importlib-metadata==2.0
    great-expectations==0.13.19
    s3fs==2021.6.0
    python-dateutil==2.8.1
    aiobotocore==1.3.0
    requests==2.25.1
    decorator==4.4.2
    pyarrow==2

I have read several posts about running Great Expectations in Lambda functions, but none of them report any issues.

Specifically: does anyone have a solution for running Python code on Lambda when the dependencies are a large set of Python packages?

score 0 · Answer 1 · answered Oct 22 '22 at 08:36

Can you show a bit more of your code and the full error stack? I would start as simply as possible: get basic validation working, then add the dependencies back one at a time until you find the culprit.

Create a simple Lambda with the minimum dependencies, perhaps just pandas and Great Expectations, and then validate one rule, as in:

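    from great_expectations.core.expectation_configuration import ExpectationConfiguration
    from great_expectations.core.expectation_suite import ExpectationSuite

    # Build a suite containing a single rule
    custom_expectation_suite = ExpectationSuite(expectation_suite_name="deliverable_rules.custom")

    custom_expectation_suite.add_expectation(
        ExpectationConfiguration(
            expectation_type="expect_column_values_to_not_be_null",
            kwargs={"column": "first_name"},
            meta={"reason": "first name should not be null"},
        )
    )

    # data_frame_to_validate is assumed to be a Great Expectations dataset,
    # e.g. the result of great_expectations.from_pandas(df)
    validation_result = data_frame_to_validate.validate(custom_expectation_suite, run_id=run_id)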
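
The "add dependencies back until you find the culprit" step can be sketched roughly as below. This is only an illustration, not part of SAM or Great Expectations: `find_first_breaking_requirement` and `sam_build_succeeds` are hypothetical helper names, and the sketch assumes that `requirements.txt` sits in the directory where `sam build --use-container` is run, which may not match your project layout.

```python
import subprocess
from pathlib import Path
from typing import Callable, List, Optional, Sequence


def find_first_breaking_requirement(
    base: Sequence[str],
    candidates: Sequence[str],
    build_succeeds: Callable[[List[str]], bool],
) -> Optional[str]:
    """Add candidates one at a time on top of base.

    Returns the first candidate whose addition makes the build fail,
    or None if every candidate builds cleanly.
    """
    current = list(base)
    if not build_succeeds(current):
        raise RuntimeError("the minimal requirement set already fails to build")
    for requirement in candidates:
        trial = current + [requirement]
        if not build_succeeds(trial):
            return requirement
        current = trial  # keep the requirement and move on to the next one
    return None


def sam_build_succeeds(requirements: List[str], project_dir: str = ".") -> bool:
    """Hypothetical build check: rewrite requirements.txt and run a SAM build.

    Assumes requirements.txt lives in project_dir; on Windows the
    executable may need to be "sam.cmd" instead of "sam".
    """
    req_file = Path(project_dir) / "requirements.txt"
    req_file.write_text("\n".join(requirements) + "\n")
    result = subprocess.run(["sam", "build", "--use-container"], cwd=project_dir)
    return result.returncode == 0


if __name__ == "__main__":
    # Dry run with a stand-in check so nothing is actually built here;
    # "package-c" is a placeholder name, not a diagnosis of the real error.
    def fake_check(reqs: List[str]) -> bool:
        return "package-c" not in reqs

    culprit = find_first_breaking_requirement(
        base=["package-a"],
        candidates=["package-b", "package-c", "package-d"],
        build_succeeds=fake_check,
    )
    print(culprit)
```

For the real project you would pass `sam_build_succeeds` as the check, with something like `["pandas", "great-expectations==0.13.19"]` as the base and the remaining lines of your requirements.txt as candidates. Each step runs a full container build, so it is slow, but it narrows the failure down to one package.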
