Great Expectations is open-source software that helps teams promote analytic integrity through a distinctive approach to data pipeline testing. Pipeline tests are applied to data (instead of code) and at batch time (instead of compile or deploy time). Pipeline tests are like unit tests for datasets: they help you guard against upstream data changes and monitor data quality. In addition to pipeline testing, GE also provides data documentation and profiling.
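The idea of a pipeline test can be sketched without the library itself. The snippet below is a minimal illustration in plain Python, not the Great Expectations API: the helper name `expect_values_between`, the column name, and the sample batch are all hypothetical, chosen only to show an assertion applied to a batch of data rather than to code.

```python
from typing import Iterable, Optional


def expect_values_between(values: Iterable[Optional[float]],
                          min_value: float,
                          max_value: float) -> dict:
    """Check that every non-null value lies in [min_value, max_value].

    Hypothetical helper for illustration; not part of Great Expectations.
    """
    unexpected = [v for v in values
                  if v is not None and not (min_value <= v <= max_value)]
    return {"success": not unexpected, "unexpected_list": unexpected}


# A "batch" here is just the rows that arrived in one pipeline run (made-up data).
batch = [{"temperature": 21.5}, {"temperature": 19.0}, {"temperature": 480.0}]
result = expect_values_between((row["temperature"] for row in batch), -50, 60)
print(result)
```

The check runs each time a new batch arrives, so an upstream change that starts producing out-of-range values shows up as a failed result instead of passing silently downstream.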
Questions tagged [great-expectations]
131 questions
0 votes, 1 answer
How to create a `BatchRequest` that returns a specific batch?
I am trying to create a BatchRequest that uses data_connector_query to filter on the group_names I previously defined in the datasource for the default_regexp of an InferredAssetS3DataConnector.
Here's the datasource config:
config = {
…

Imad
0 votes, 1 answer
Great Expectations: expect_values_to_match_regex rule gives an error for every regex
NOTE: Running on Snowflake
I need the regex to validate SSNs. The one I'm using is:
Regex used: '^(?!666|000|9\\d{2})\\d{3}-(?!00)\\d{2}-(?!0{4})\\d{4}$'
Error message:
Invalid regular expression:…

manas swami
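For reference, the pattern in the question above relies on negative lookaheads (`(?!...)`). The sketch below only shows how that pattern behaves under Python's `re` module, which supports lookaheads. It assumes the doubled backslashes in the question are string escaping, and it says nothing about which syntax a particular database regex engine accepts. The sample strings are made up.

```python
import re

# Pattern copied from the question, written as a raw string so each
# backslash appears once.
SSN_PATTERN = re.compile(r"^(?!666|000|9\d{2})\d{3}-(?!00)\d{2}-(?!0{4})\d{4}$")


def is_valid_ssn(text: str) -> bool:
    """Return True when the text matches the SSN pattern above."""
    return SSN_PATTERN.match(text) is not None


# Made-up sample values: one well-formed, four that hit a lookahead rule.
samples = ["123-45-6789", "666-45-6789", "900-45-6789", "123-00-6789", "123-45-0000"]
for sample in samples:
    print(sample, is_valid_ssn(sample))
```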
0 votes, 1 answer
How to use dagster with great expectations?
The issue
I'm trying out great expectations with dagster, as per this guide
My pipeline seems to execute correctly until it reaches this block:
expectation = dagster_ge.ge_validation_op_factory(
name='ge_validation_op',
…

Imad
0 votes, 0 answers
How to set up conditional expectations for Snowflake with Great Expectations?
I'm trying to set up a Conditional Expectation with Great Expectations for a Snowflake table that's in a long format.
For this test, consider a table with two columns, measurement_type and value. I want to check that the value is between 0 and 360…

canuckdownunder
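The logic of a conditional check on a long-format table can be sketched in plain Python as "filter rows by a condition, then validate only those rows". This is an illustration of the idea, not the Great Expectations or Snowflake API. The helper `check_conditional_range`, the measurement type labels `"angle"` and `"length"`, and the sample rows are hypothetical, since the excerpt does not say which measurement type the range applies to.

```python
from typing import Callable, Dict, List


def check_conditional_range(rows: List[Dict],
                            condition: Callable[[Dict], bool],
                            column: str,
                            min_value: float,
                            max_value: float) -> Dict:
    """Validate a range only on the rows for which `condition` holds."""
    selected = [row for row in rows if condition(row)]
    unexpected = [row for row in selected
                  if not (min_value <= row[column] <= max_value)]
    return {
        "evaluated": len(selected),
        "unexpected": unexpected,
        "success": not unexpected,
    }


# Made-up long-format data: one row per (measurement_type, value) pair.
rows = [
    {"measurement_type": "angle", "value": 90.0},
    {"measurement_type": "angle", "value": 400.0},
    {"measurement_type": "length", "value": 1200.0},
]

result = check_conditional_range(
    rows,
    condition=lambda row: row["measurement_type"] == "angle",  # hypothetical label
    column="value",
    min_value=0,
    max_value=360,
)
print(result)
```

Rows of other measurement types are never evaluated, so a large `length` value does not fail a range that was only meant for angles.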
0 votes, 0 answers
row_condition based on length of a column great expectations
I am trying to validate an id column against expect_column_values_to_match_regex with row_condition based on id column's length.
I have a specific regex to apply based on the length of the id.
How do I do it with row_condition? The examples in the…

Dasa Sathyan
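The rule described in the question above, a different regex depending on the length of the id, can be sketched in plain Python as a lookup from length to pattern. This is not a `row_condition` expression and not the Great Expectations API. The lengths and patterns in `PATTERNS_BY_LENGTH` are hypothetical placeholders, because the excerpt does not give the real ones.

```python
import re

# Hypothetical rules: the real lengths and patterns are not given in the question.
PATTERNS_BY_LENGTH = {
    8: re.compile(r"[A-Z]{2}\d{6}"),
    10: re.compile(r"\d{10}"),
}


def id_matches_rule(id_value: str) -> bool:
    """Pick the pattern for this id's length and require a full match.

    Ids whose length has no rule are treated as failures in this sketch.
    """
    pattern = PATTERNS_BY_LENGTH.get(len(id_value))
    return pattern is not None and pattern.fullmatch(id_value) is not None


for candidate in ["AB123456", "1234567890", "12345678", "ABC"]:
    print(candidate, id_matches_rule(candidate))
```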
0 votes, 0 answers
Tag great expectation results
When using an ExpectationStore such as S3, is there a field you can include to date or tag the GE JSON result so that it shows up in the GE UI as part of a specific dated batch? For example, you run daily or hourly jobs, and you would like…

pyCthon
0 votes, 1 answer
GreatExpectations' data_docs (tutorial) fail to work correctly on WSL2
I'm testing out Great Expectations by following this tutorial:
Unfortunately, my Jupyter notebooks could not open the browser directly at first, but I was able to fix that behavior by following this thread, which has to do with Jupyter notebook…

Imad
0 votes, 1 answer
Can't get expect_table_columns_to_match_set to work
If I define df_asset as follows
import pandas as pd
import great_expectations as ge
df_asset = ge.from_pandas(pd.DataFrame({'A': [1.1, 2.2, 3.3], 'B': [4.4, 5.5, 6.6]}))
then the expect_table_columns_to_match_ordered_list method works (output on 2nd…

Elis
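The difference between matching an ordered list of columns and matching a set of columns can be sketched in plain Python. The two helper functions below are hypothetical illustrations of the comparison itself, not the Great Expectations methods named in the question, and they use the column names `A` and `B` from the question's DataFrame.

```python
from typing import Iterable


def columns_match_ordered_list(columns: Iterable[str], expected: Iterable[str]) -> bool:
    """Order matters: same names in the same positions."""
    return list(columns) == list(expected)


def columns_match_set(columns: Iterable[str], expected: Iterable[str]) -> bool:
    """Order is ignored: same names, any positions."""
    return set(columns) == set(expected)


actual_columns = ["A", "B"]  # column names from the question's DataFrame

print(columns_match_ordered_list(actual_columns, ["B", "A"]))
print(columns_match_set(actual_columns, ["B", "A"]))
```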
0 votes, 1 answer
Great Expectations with a Delta table
I am trying to run a Great Expectations suite on a Delta table in Databricks, but I want to run it on only part of the table, selected with a query. The validation runs fine, but it runs on the full table.
I know that I can load a DataFrame…

S.Dasgupta
0 votes, 1 answer
Great Expectation - Error: KeyError: "Neither config : {...} nor config_defaults : {} contains a class_name key."
I'm trying to create and run a Great Expectations checkpoint. For this, I created the following Python script:
import sys
from datetime import datetime
from great_expectations.data_context import DataContext
from…

Fabian Matias Vega Alcota
0 votes, 1 answer
Great Expectations – Generating Data Doc Without CLI on In-Memory Pandas Dataframe
I am new to the Great Expectations package. I found this tutorial for connecting to a data source, validating the data and visualising the output as a data doc which is saved as an HTML file.…
0 votes, 1 answer
Case sensitive table/column names in Snowflake datasource for Great Expectations
I’ve noticed that my Snowflake expectations only work when the column parameter is written in lowercase (even if the original column is in uppercase):
"expectations": [
{
"expectation_type": "expect_column_values_to_be_unique",
…

Joaquín L. Robles
0 votes, 1 answer
ImportError: cannot import name '_device' from partially initialized module 'zmq.backend.cython'
I'm trying to get Great Expectations running on a Windows 10 laptop. Below is what I get when I enter
great_expectations --version
I've installed it without problems on my desktop and my Mac, but I can't figure out what the issue is here. This laptop does have…

Jaehaerys68
0 votes, 0 answers
Great Expectations: why does the profiler throw an error up to 0.15.18 but not later?
This dataframe:
df = pd.DataFrame(
    [
        {"name": "Ross", "dob": pd.Timestamp("1967-10-18")},
        {"name": "Rachel", "dob": pd.Timestamp("1968-05-05")},
        {"name": "Phoebe", "dob": None},
    ]
)
Would cause the…

Jorge Cespedes
0 votes, 3 answers
How does one run Great Expectations from Docker using a Dockerfile to build the image?
I am pretty new to Great Expectations (GX) and very new to Docker, and now I am trying to combine the two. I can get a Docker image to build just fine, but when I try to run a container, it fails. I can get my GX Checkpoint to run from both the GX…

Timmy Beatty