I'm using the Great Expectations Python package (version 0.14.10) to validate some data. I've already followed the provided tutorials and created a great_expectations.yml in the local ./great_expectations folder. I've also created a Great Expectations suite based on a .csv version of the data (call this file ge_suite.json).
GOAL: I want to use the ge_suite.json file to validate an in-memory pandas DataFrame.
I've tried following an answer to this SO question, with code that looks like this:
import great_expectations as ge
import pandas as pd
from ruamel import yaml
from great_expectations.data_context import DataContext
context = DataContext()
df = pd.read_pickle('/path/to/my/df.pkl')
batch_kwargs = {"datasource": "my_datasource_name", "dataset": df}
batch = context.get_batch(batch_kwargs=batch_kwargs, expectation_suite_name="ge_suite")
My datasources section of my great_expectations.yml file looks like this:
datasources:
  my_datasource_name:
    execution_engine:
      module_name: great_expectations.execution_engine
      class_name: PandasExecutionEngine
    module_name: great_expectations.datasource
    class_name: Datasource
    data_connectors:
      default_inferred_data_connector_name:
        module_name: great_expectations.datasource.data_connector
        base_directory: /tmp
        class_name: InferredAssetFilesystemDataConnector
        default_regex:
          group_names:
            - data_asset_name
          pattern: (.*)
      default_runtime_data_connector_name:
        batch_identifiers:
          - default_identifier_name
        module_name: great_expectations.datasource.data_connector
        class_name: RuntimeDataConnector
When I run the batch = context.get_batch(... command in Python, I get the following error:
  File "/Users/username/opt/miniconda3/envs/myenv/lib/python3.8/site-packages/great_expectations/data_context/data_context.py", line 1655, in get_batch
    return self._get_batch_v2(
  File "/Users/username/opt/miniconda3/envs/myenv/lib/python3.8/site-packages/great_expectations/data_context/data_context.py", line 1351, in _get_batch_v2
    batch = datasource.get_batch(
AttributeError: 'Datasource' object has no attribute 'get_batch'
I'm assuming that I need to add something to the definition of the datasource in the great_expectations.yml file to fix this. Or, could it be a versioning issue? I'm not sure. I looked around for a while in the online documentation and didn't find an answer. How do I achieve the "GOAL" (defined above) and get past this error?
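For context on the versioning guess: the traceback goes through _get_batch_v2, which is the batch_kwargs code path, while the datasource above is configured in the Batch Request style (class_name: Datasource with an execution_engine and data_connectors). A minimal sketch of the Batch Request route follows, passing the DataFrame through the RuntimeDataConnector already defined in the config. It has not been run against this project. The helper names, the data_asset_name "my_dataframe", and the identifier value "in_memory_df" are hypothetical, chosen only for illustration.

```python
def runtime_batch_request_kwargs(df, data_asset_name="my_dataframe"):
    """Build keyword arguments for a RuntimeBatchRequest wrapping an in-memory DataFrame.

    The datasource and data connector names come from the great_expectations.yml
    shown above; data_asset_name and the identifier value are hypothetical labels.
    """
    return {
        "datasource_name": "my_datasource_name",
        "data_connector_name": "default_runtime_data_connector_name",
        "data_asset_name": data_asset_name,
        # The DataFrame itself is handed over under the "batch_data" key.
        "runtime_parameters": {"batch_data": df},
        # The key must match an entry under batch_identifiers in the YAML config.
        "batch_identifiers": {"default_identifier_name": "in_memory_df"},
    }


def validate_dataframe(context, df, suite_name="ge_suite"):
    """Validate df against an existing expectation suite and return the result."""
    # Imported lazily so the helper above works without Great Expectations installed.
    from great_expectations.core.batch import RuntimeBatchRequest

    batch_request = RuntimeBatchRequest(**runtime_batch_request_kwargs(df))
    validator = context.get_validator(
        batch_request=batch_request,
        expectation_suite_name=suite_name,
    )
    return validator.validate()


# Possible usage, assuming the same context and pickle path as in the question:
#   context = DataContext()
#   df = pd.read_pickle('/path/to/my/df.pkl')
#   results = validate_dataframe(context, df)
#   print(results.success)
```

This assumes ge_suite.json is stored where the context's expectations store can find it under the name "ge_suite".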