
I have this folder structure for my Great Expectations project:

great_expectations/
    dataset/
        __init__.py
        oracle_dataset.py
    datasource/
        __init__.py
        oracle_datasource.py
    great_expectations.yml

datasource/__init__.py:

from .oracle_datasource import OracleDatasource

dataset/__init__.py:

from .oracle_dataset import OracleDataset

great_expectations.yml:

datasources:
  db_name:
    credentials: ${db_name}
    data_asset_type:
      class_name: OracleDataset
      module_name: .dataset
    class_name: OracleDatasource
    module_name: .datasource

Python relative imports are very confusing to me, and I am also not sure which directory is used as the reference when running great_expectations commands. When I run great_expectations suite new, I get this error message:

ValueError: no package specified for '.datasource' (required for relative module names)
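The wording of that error matches what Python's standard importlib produces when it is asked to resolve a name with a leading dot but is given no anchor package. Here is a minimal sketch using importlib.util.resolve_name. Whether Great Expectations calls this exact function is an assumption, and try_resolve is a hypothetical helper used only for illustration:

```python
import importlib.util


def try_resolve(name, package=None):
    """Return the absolute module name, or the error text if resolution fails."""
    try:
        return importlib.util.resolve_name(name, package)
    except (ValueError, ImportError) as exc:
        # The exception type has differed across Python versions, so catch both.
        return f"error: {exc}"


# No anchor package: a leading dot has nothing to be relative to.
no_anchor = try_resolve(".datasource")

# With an anchor, the name resolves inside that package's namespace.
anchored = try_resolve(".datasource", package="great_expectations")

print(no_anchor)
print(anchored)
```

Even with an anchor, the relative name only expands to great_expectations.datasource, which is an absolute name looked up like any other import. It does not point at a folder on disk.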

Even after trying everything below, I still think the .yml above is the way to go. I'm guessing there is something I don't understand about relative imports that needs to be handled in the __init__.py files or elsewhere.

Edit: I have also tried:

datasources:
  db_name:
    credentials: ${db_name}
    data_asset_type:
      class_name: OracleDataset
      module_name: great_expectations.dataset
    class_name: OracleDatasource
    module_name: great_expectations.datasource

This gives the error: The module: 'great_expectations.datasource' does not contain the class: 'OracleDatasource'.

I think this message means it is looking in the great_expectations library and I confirmed this by trying a class name that is included in the library.
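Another way to check which file a module name resolves to is to ask importlib for its spec. This is a small sketch in which module_origin is a hypothetical helper and "json" stands in for any installed package. In a real environment you would pass "great_expectations" instead:

```python
import importlib.util


def module_origin(module_name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


# "json" is only a stand-in for an installed package.
print(module_origin("json"))

# A name that is not importable from sys.path yields None.
print(module_origin("surely_not_a_real_module_xyz"))
```

If the printed path lies inside site-packages, the name is resolving to the installed library and not to the project folder.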

and this:

datasources:
  db_name:
    credentials: ${db_name}
    data_asset_type:
      class_name: OracleDataset
      module_name: dataset
    class_name: OracleDatasource
    module_name: datasource

This gives the error: No module named "datasource" could be found in the repository. Please make sure that the file, corresponding to this package and module, exists and that dynamic loading of code modules, templates, and assets is supported in your execution environment. This error is unrecoverable.

I think this means it is looking outside of the library but can't find the file.

and this:

datasources:
  db_name:
    credentials: ${db_name}
    data_asset_type:
      class_name: OracleDataset
      module_name: dataset.oracle_dataset
    class_name: OracleDatasource
    module_name: datasource.oracle_datasource

This gives the error: No module named "datasource.oracle_datasource" could be found in the repository. Please make sure that the file, corresponding to this package and module, exists and that dynamic loading of code modules, templates, and assets is supported in your execution environment. This error is unrecoverable.

Again, I think this means it is looking outside of the library but can't find the file.

Pierre Delecto

1 Answer


To extend Great Expectations, use the plugins/ directory in your project (this folder is created automatically when you run great_expectations init). Modules added there can be referenced in your configuration.

Add oracle_datasource.py and oracle_dataset.py to the plugins folder:

.
├── custom_data_docs
│   ├── renderers
│   ├── styles
│   │   └── data_docs_custom_styles.css
│   └── views
├── oracle_dataset.py
└── oracle_datasource.py

Edit the YAML in great_expectations.yml as follows. Module names are resolved relative to the plugins folder, so there is no need for dotted paths if your new modules are at the root of that folder:

datasources:
  db_name:
    credentials: ${db_name}
    data_asset_type:
      class_name: OracleDataset
      module_name: oracle_dataset
    class_name: OracleDatasource
    module_name: oracle_datasource
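Here is a sketch of why bare module names work from the plugins folder. It assumes the folder is placed on sys.path and that each module_name / class_name pair is resolved with importlib.import_module and getattr. This is an illustration, not Great Expectations' actual loader. load_class is a hypothetical helper, and a temporary directory stands in for plugins/:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_class(module_name, class_name):
    """Import module_name and return the attribute called class_name."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


with tempfile.TemporaryDirectory() as plugins_dir:
    # Hypothetical stand-in for plugins/oracle_datasource.py.
    Path(plugins_dir, "oracle_datasource.py").write_text(
        "class OracleDatasource:\n    pass\n"
    )
    # Putting the folder on sys.path makes its files importable as top-level modules.
    sys.path.insert(0, plugins_dir)
    importlib.invalidate_caches()
    try:
        cls = load_class("oracle_datasource", "OracleDatasource")
    finally:
        sys.path.remove(plugins_dir)

print(cls.__module__, cls.__name__)
```

Because the file sits at the root of a directory on sys.path, its module name is simply oracle_datasource, with no package prefix and no leading dot.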

One last thing: I'm sure the Great Expectations community would love to have an Oracle Datasource and Dataset if you would be willing to contribute them back into the main package! https://docs.greatexpectations.io/en/latest/contributing.html

aburdi
  • Yes, when I try this it seems to look in the actual great_expectations library: `The module: 'great_expectations.datasource' does not contain the class: 'OracleDatasource'.` – Pierre Delecto Mar 25 '21 at 15:18
  • I'm sorry - this is incorrect for extending Great Expectations, I'll edit it to a better answer shortly using the plugins/ directory. – aburdi Mar 26 '21 at 15:59
  • @PierreDelecto - I hope this helped! If so please mark this question as answered. – aburdi Apr 05 '21 at 18:59