Great Expectations is open source software that helps teams promote analytic integrity by offering a unique approach to data pipeline testing. Pipeline tests are applied to data (instead of code) and at batch time (instead of compile or deploy time). Pipeline tests are like unit tests for datasets: they help you guard against upstream data changes and monitor data quality. In addition to pipeline testing, Great Expectations (GE) also provides data documentation and profiling.
Questions tagged [great-expectations]
131 questions
0
votes
1 answer
Mark Great Expectation validation as failed or passed based on a percentage of failure
I am using Great Expectations in my ETL data pipeline for a POC. I have a validation which is failing (as expected), and I have the following data in my validation JSON:
"unexpected_count": 205,
"unexpected_percent": 10.25,
…
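
One way to read the question above is as a tolerance check on the `unexpected_percent` field of a single result. The sketch below is plain Python and does not call Great Expectations; the function name `passes_with_tolerance` and the threshold values are hypothetical, and only the two JSON fields come from the excerpt. Column-map expectations in Great Expectations also accept a `mostly` argument, which may be the more direct route when the tolerance is known at the time the expectation is defined.

```python
def passes_with_tolerance(result: dict, max_unexpected_percent: float) -> bool:
    """Return True when the share of unexpected values is within tolerance.

    `result` is assumed to be the "result" section of one expectation's
    validation output, carrying an "unexpected_percent" field (0-100 scale).
    """
    unexpected_percent = result.get("unexpected_percent")
    if unexpected_percent is None:
        # Without the field there is nothing to compare, so treat it as failed.
        return False
    return unexpected_percent <= max_unexpected_percent


# Values taken from the excerpt; the thresholds are illustrative assumptions.
result = {"unexpected_count": 205, "unexpected_percent": 10.25}
print(passes_with_tolerance(result, 15.0))  # True
print(passes_with_tolerance(result, 5.0))   # False
```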

Kuwali
0
votes
1 answer
Great expectations saving custom queries
Creating a batch with a custom query does not save the query in the JSON file.
For example:
batch_kwargs = {'data_asset_name': 'pgsql.data_asset_name',
'query': 'select entity_id,attribute from table_name where attribute > 50 and',
…

MariaMadalina
0
votes
1 answer
How to Save Great Expectations Html validation results to Databricks DBFS or Azure Blob
Some time ago I asked the question
How to Save Great Expectations results to File From Apache Spark - With Data Docs
The answers centred on viewing the results in…

Patterson
0
votes
1 answer
how to execute great expectations results into azure functions?
I am trying to deploy Azure Functions with the Great_expectations folders. The function runs on my local system, but I get the error below when executing it in the portal.
Result: Failure Exception: OSError: [Errno 30]…

Manasa Murugan
0
votes
0 answers
'Datasource' object has no attribute 'get_batch'
I am trying to integrate Great Expectations into an Airflow pipeline. I followed this url to integrate it, but I am getting an error saying the object has no attribute, even though the Context instance has that attribute. Here is my code.
from pathlib import Path
from…

c__c
0
votes
1 answer
Column names in great expectations
Are there any specific rules for column names in great expectations? In particular, if you have a column like a.age ? would it have to be renamed to a_age in order to run an expectation on it?

stackguy1723
0
votes
0 answers
Great Expectation Validation Failed and Job aborted
I am working on a data monitoring task where I am using the Great Expectations framework to monitor the quality of the data. I am using Airflow, BigQuery, and Great Expectations together to achieve this.
I have set the param is_blocking:False for…

Jack Daniel
0
votes
1 answer
RunGreatExpectationsValidation execution returns an exception
I am struggling with a great_expectations integration problem.
I obviously use RunGreatExpectationsValidation task with:
validation_task = RunGreatExpectationsValidation()
with Flow(
"GE_pull_and_run",
) as GE_pull_and_run_flow:
.......
…

vBob
0
votes
0 answers
How to Change the defaults in Great Expectations DataDoc HTML Report?
Great Expectations provides the ability to produce HTML reports using Data Docs, as shown in the following example:
I would like to change the following defaults in the header - see image
The report is generated using the…

Patterson
0
votes
1 answer
How to code Unsuccessful, Failed results in Great_Expectations
I am evaluating Great Expectations to do some Data Cleaning.
I have managed to get most of the code working for our needs. I am having a problem with the attribute needed to code for unsuccessful results. For example, the following code will print…
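
The code in the question above is elided, so the following is only a sketch of one way to pick out unsuccessful results. It assumes the validation output has been converted to a plain dictionary with a `results` list whose items carry a `success` flag and an `expectation_config` entry; the helper name `failed_expectations` and the sample data are hypothetical.

```python
def failed_expectations(validation: dict) -> list:
    """Collect the expectation types whose "success" flag is False.

    The dictionary layout used here is an assumption about the validation
    output, not something shown in the question.
    """
    failed = []
    for item in validation.get("results", []):
        if not item.get("success", False):
            config = item.get("expectation_config", {})
            failed.append(config.get("expectation_type", "<unknown>"))
    return failed


# Hypothetical sample data for illustration only.
sample = {
    "success": False,
    "results": [
        {
            "success": True,
            "expectation_config": {"expectation_type": "expect_column_to_exist"},
        },
        {
            "success": False,
            "expectation_config": {
                "expectation_type": "expect_column_values_to_not_be_null"
            },
        },
    ],
}
print(failed_expectations(sample))
```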

Patterson
0
votes
1 answer
great_expectations : expect_column_values_to_match_json_schema does not take json schema as input
I am trying to invoke
expect_column_values_to_match_json_schema
as…

AbtPst
0
votes
1 answer
How to create Great Expectations checkpoint for Pandas dataframe?
My datasource config looks like:
datasource_config = {
"name": "example_datasource",
"class_name": "Datasource",
"module_name": "great_expectations.datasource",
"execution_engine": {
"module_name":…

Valentyn
0
votes
1 answer
There should be one and only one value in column B for every value in column A - Pandas
I have a data frame as shown in the image:
I want an output similar to the Value column.
This means that for every value in column A, there can be one and only one value in column B.
Even if value in column A repeats, the value in column B should…
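
The rule described above (each value in column A maps to exactly one value in column B) can be sketched in plain Python as follows. The rows are represented as dictionaries, and the function name `keys_with_multiple_values` and the sample rows are hypothetical; the question's image and desired output format are not available. With a pandas DataFrame, `df.groupby("A")["B"].nunique()` gives the per-key count of distinct B values that the same check relies on.

```python
from collections import defaultdict


def keys_with_multiple_values(rows, key="A", value="B"):
    """Return the keys that map to more than one distinct value.

    An empty result means every value in column `key` has one and only
    one value in column `value`.
    """
    seen = defaultdict(set)
    for row in rows:
        seen[row[key]].add(row[value])
    return {k: sorted(v) for k, v in seen.items() if len(v) > 1}


# Hypothetical rows: key 1 repeats with the same B value, key 2 does not.
rows = [
    {"A": 1, "B": "x"},
    {"A": 1, "B": "x"},
    {"A": 2, "B": "y"},
    {"A": 2, "B": "z"},
]
print(keys_with_multiple_values(rows))  # {2: ['y', 'z']}
```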

Vivek Shukla
0
votes
1 answer
An unexpected keyword argument '_metrics' in _pandas for custom expectations with great expectations v3 api?
I am trying to create a very simple expectation with Great Expectations v3 api: expect_column_values_to_be_positive. I am using PandasExecutionEngine and my data asset is a pandas dataframe.
my_custom_expectation.py is located in the plugins/…
0
votes
1 answer
Airflow - Great Expectations - Send evaluation parameters through to GreatExpectationsOperator
For anyone that has used GreatExpectations in airflow, does anyone know if it is possible to send evaluation_parameters through with the airflow GreatExpectationsOperator? I am currently trying this and receiving the…

adan11