Questions tagged [great-expectations]

Great Expectations is open source software that helps teams promote analytic integrity by offering a unique approach to data pipeline testing. Pipeline tests are applied to data (instead of code) and at batch time (instead of compile or deploy time). Pipeline tests are like unit tests for datasets: they help you guard against upstream data changes and monitor data quality. In addition to pipeline testing, GE also provides data documentation and profiling.

131 questions
0
votes
1 answer

Mark Great Expectation validation as failed or passed based on a percentage of failure

I am using Great Expectations in my ETL data pipeline for a POC. I have a validation which is failing (as expected), and I have the following data in my validation JSON: "unexpected_count": 205, "unexpected_percent": 10.25, …
Kuwali
  • 233
  • 3
  • 13
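
A minimal sketch of one way to approach the question above, assuming the validation JSON exposes `unexpected_percent` as in the excerpt. The helper name `passes_with_tolerance`, the thresholds, and the sample dictionary are hypothetical and only illustrate the comparison; they are not part of the Great Expectations API. Column map expectations in Great Expectations also accept a `mostly` argument, which may cover the same need at expectation level.

```python
def passes_with_tolerance(result, max_unexpected_percent):
    """Return True if the share of unexpected values is within tolerance.

    `result` is assumed to be a dict shaped like the excerpt above,
    with an "unexpected_percent" key on a 0-100 scale.
    """
    unexpected_percent = result.get("unexpected_percent")
    if unexpected_percent is None:
        # No percentage reported: treat as a failure rather than guess.
        return False
    return unexpected_percent <= max_unexpected_percent


# Values taken from the question excerpt; the thresholds are made up.
sample_result = {"unexpected_count": 205, "unexpected_percent": 10.25}

print(passes_with_tolerance(sample_result, 15.0))  # within a 15% tolerance
print(passes_with_tolerance(sample_result, 5.0))   # exceeds a 5% tolerance
```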
0
votes
1 answer

Great expectations saving custom queries

Creating a batch and using a custom query does not save the query in the JSON file. For example: batch_kwargs = {'data_asset_name': 'pgsql.data_asset_name', 'query': 'select entity_id,attribute from table_name where attribute > 50 and', …
MariaMadalina
  • 479
  • 6
  • 20
0
votes
1 answer

How to Save Great Expectations Html validation results to Databricks DBFS or Azure Blob

Some time ago I asked the question How to Save Great Expectations results to File From Apache Spark - With Data Docs The answers centred on viewing the results in…
Patterson
  • 1,927
  • 1
  • 19
  • 56
0
votes
1 answer

how to execute great expectations results into azure functions?

I am trying to deploy Azure functions with Great_expectations folders. The function is getting executed in the local system but facing the below error while executing the function in the portal. Result: Failure Exception: OSError: [Errno 30]…
0
votes
0 answers

'Datasource' object has no attribute 'get_batch'

I am trying to integrate Great Expectations into an Airflow pipeline. I followed this URL to integrate it, but I am getting an "object has no attribute" error, even though the Context instance has that attribute. Here is my code: from pathlib import Path from…
c__c
  • 1,574
  • 1
  • 19
  • 39
0
votes
1 answer

Column names in great expectations

Are there any specific rules for column names in Great Expectations? In particular, if you have a column like a.age, would it have to be renamed to a_age in order to run an expectation on it?
stackguy1723
  • 165
  • 1
  • 2
  • 12
0
votes
0 answers

Great Expectation Validation Failed and Job aborted

I am working on a data monitoring task where I am using the Great Expectations framework to monitor the quality of the data. I am using Airflow, BigQuery, and Great Expectations together to achieve this. I have set the param is_blocking:False for…
Jack Daniel
  • 2,527
  • 3
  • 31
  • 52
0
votes
1 answer

RunGreatExpectationsValidation execution returns an exception

I am struggling with a great_expectations integration problem. I use the RunGreatExpectationsValidation task with: validation_task = RunGreatExpectationsValidation() with Flow( "GE_pull_and_run", ) as GE_pull_and_run_flow: ....... …
vBob
  • 3
  • 1
0
votes
0 answers

How to Change the defaults in Great Expectations DataDoc HTML Report?

Great Expectations provides the ability to produce HTML reports using Data Docs, as shown in the following example: I would like to change the following defaults in the header - see image The report is generated using the…
Patterson
  • 1,927
  • 1
  • 19
  • 56
0
votes
1 answer

How to code Unsuccessful, Failed results in Great_Expectations

I am evaluating Great Expectations to do some data cleaning. I have managed to get most of the code working for our needs. I am having a problem with the attribute needed to code for unsuccessful results. For example, the following code will print…
Patterson
  • 1,927
  • 1
  • 19
  • 56
0
votes
1 answer

great_expectations : expect_column_values_to_match_json_schema does not take json schema as input

I am trying to invoke expect_column_values_to_match_json_schema as…
AbtPst
  • 7,778
  • 17
  • 91
  • 172
0
votes
1 answer

How to create Great Expectations checkpoint for Pandas dataframe?

My datasource config looks like: datasource_config = { "name": "example_datasource", "class_name": "Datasource", "module_name": "great_expectations.datasource", "execution_engine": { "module_name":…
Valentyn
  • 562
  • 1
  • 7
  • 21
0
votes
1 answer

There should be one and only one value in column B for every value in column A - Pandas

I have a data frame as shown in the image: I want an output similar to the Value column. What it means is that for every value in column A, there can be one and only one value in column B. Even if a value in column A repeats, the value in column B should…
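
A rough pandas sketch of the check described above, assuming columns literally named `A` and `B`. The sample data and the `is_consistent` column are hypothetical, since the original image and the exact shape of the Value column are not available here. It counts the distinct `B` values per `A` value and flags any `A` that maps to more than one.

```python
import pandas as pd

# Made-up sample: "y" maps to two different B values, so it violates the rule.
df = pd.DataFrame(
    {
        "A": ["x", "x", "y", "y", "z"],
        "B": [1, 1, 2, 3, 4],
    }
)

# Number of distinct B values observed for each value of A.
distinct_b = df.groupby("A")["B"].nunique()

# Values of A that are paired with more than one B.
violations = distinct_b[distinct_b > 1].index.tolist()

# Row-level flag: True where the row's A value maps to exactly one B.
df["is_consistent"] = df["A"].map(distinct_b).eq(1)

print(violations)
print(df)
```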
0
votes
1 answer

An unexpected keyword argument '_metrics' in _pandas for custom expectations with great expectations v3 api?

I am trying to create a very simple expectation with Great Expectations v3 api: expect_column_values_to_be_positive. I am using PandasExecutionEngine and my data asset is a pandas dataframe. my_custom_expectation.py is located in the plugins/…
0
votes
1 answer

Airflow - Great Expectations - Send evaluation parameters through to GreatExpectationsOperator

For anyone that has used GreatExpectations in airflow, does anyone know if it is possible to send evaluation_parameters through with the airflow GreatExpectationsOperator? I am currently trying this and receiving the…
adan11
  • 647
  • 1
  • 7
  • 24