1

Great Expectations creates temporary tables. I tried profiling data in my Snowflake lab. It worked because the role I was using could create tables in the schema that contained the tables I was profiling.

I tried to profile a table in a Snowflake share, where we can't create objects, and it failed:

(snowflake.connector.errors.ProgrammingError) 002003 (02000): SQL compilation error:
Schema 'OUR_DATABASE.SNOWFLAKE_SHARE_SCHEMA' does not exist or not authorized.
[SQL: CREATE OR REPLACE TEMPORARY TABLE ge_temp_3eb6c50b AS SELECT * 
FROM "SNOWFLAKE_SHARE_SCHEMA"."INTERESTING_TABLE" 
WHERE true]
(Background on this error at: https://sqlalche.me/e/14/f405)

Here's the output from the CLI:

% great_expectations suite new                
Using v3 (Batch Request) API

How would you like to create your Expectation Suite?
    1. Manually, without interacting with a sample batch of data (default)
    2. Interactively, with a sample batch of data
    3. Automatically, using a profiler
: 3

A batch of data is required to edit the suite - let's help you to specify it.

Select data_connector
    1. default_runtime_data_connector_name
    2. default_inferred_data_connector_name
    3. default_configured_data_connector_name
: 3

Which data asset (accessible by data connector "default_configured_data_connector_name") would you like to use?
    1. INTERESTING_TABLE

Type [n] to see the next page or [p] for the previous. When you're ready to select an asset, enter the index.
: 1

Name the new Expectation Suite [INTERESTING_TABLE.warning]: 

Great Expectations will create a notebook, containing code cells that select from available columns in your dataset and
generate expectations about them to demonstrate some examples of assertions you can make about your data.

When you run this notebook, Great Expectations will store these expectations in a new Expectation Suite "INTERESTING_TABLE.warning" here:

  file:///path/to-my-repo/great_expectations/expectations/INTERESTING_TABLE/warning.json

Would you like to proceed? [Y/n]: Y

two-schemas-ro-has-data-rw-to-create-temporary-objects

Here's the datasources section from great_expectations.yml:

datasources:
    our_snowflake:
        class_name: Datasource
        module_name: great_expectations.datasource
        execution_engine:
        module_name: great_expectations.execution_engine
        credentials:
            host: xyz92716.us-east-1
            username: MYUSER
            query:
            schema: MYSCHEMA
            warehouse: MY_WAREHOUSE
            role: RW_ROLE
            password: password1234
            drivername: snowflake
        class_name: SqlAlchemyExecutionEngine
        data_connectors:
        default_runtime_data_connector_name:
            class_name: RuntimeDataConnector
            batch_identifiers:
            - default_identifier_name
            module_name: great_expectations.datasource.data_connector
        default_inferred_data_connector_name:
            include_schema_name: true
            class_name: InferredAssetSqlDataConnector
            introspection_directives:
            schema_name: SNOWFLAKE_SHARE_SCHEMA
            module_name: great_expectations.datasource.data_connector
        default_configured_data_connector_name:
            assets:
            INTERESTING_TABLE:
                schema_name: SNOWFLAKE_SHARE_SCHEMA
                class_name: Asset
                module_name: great_expectations.datasource.data_connector.asset
            class_name: ConfiguredAssetSqlDataConnector
            module_name: great_expectations.datasource.data_connector

How can I tweak great_expectations.yml so that temporary objects are created in a separate database and schema from the datasource?

As a workaround, we created a view in the schema with read/write that points to the data in the read-only share. That adds an extra step. I'm hoping there's a simple config to create temporary objects outside the schema being profiled.

Alex Woolford
  • 4,433
  • 11
  • 47
  • 80
  • Have you ever been able to figure this out? I'm wondering the same wrt BigQuery – Juergen Jun 16 '23 at 19:37
  • Update: this used to be possible with BQ but has been deprecated for a while now https://github.com/great-expectations/great_expectations/blob/09b39ec0d2df788ce8171012a2f0847d20f31ca8/tests/integration/db/bigquery.py#LL16C57-L16C76. Not sure what the situation is with Snowflake @ OP – Juergen Jun 19 '23 at 16:12

0 Answers0