1

This question is a bit related to another question: Get List of Primary Key Columns in Snowflake.

Since INFORMATION_SCHEMA.COLUMNS does not provide the required information regarding the primary keys. And the method proposed by Snowflake itself, where you would describe the table followed by a result_scan, is unreliable when queries are run in parallel. I was thinking about using SHOW PRIMARY KEYs IN DATABASE. This works great when querying the database from within Snowflake. But as soon as I try to do this in python, I get results for the column name like 'Built-in function id'. Which is not useful when dynamically generating sql statements.

The code I am using is as follows:

SQL_PK = "SHOW PRIMARY KEYS IN DATABASE;"
snowflake_service = SnowflakeService(username=cred["username"], password=cred["password"])
snowflake_service.connect(database=DATABASE,role=ROLE, warehouse=WAREHOUSE)
curs = snowflake_service.cursor
primary_keys = curs.execute(SQL_PK).fetchall()
curs.close()
snowflake_service.connection.close()

Is there something I am doing wrong? Is it even possible to do it like this? Or is the solution that Snowflake provides reliable enough, when sending these queries as one string? Although with many tables, there will be many round trips required to get all the data needed.

BlueIced
  • 131
  • 1
  • 12
  • What do you mean when you say "queries are run in parallel"? Do you mean queries are run in parallel within sessions or on the database as a whole? Normally `result_scan` with `last_query_id` is fine because it only gets the last query in the current session – Simon D Apr 13 '21 at 19:47
  • Have you tried the UDTF that the post you reference provides? A UDTF would be something you could simply select from in python as if it is a table. – Mike Walton Apr 14 '21 at 01:59

1 Answers1

1

where you would describe the table followed by a result_scan, is unreliable when queries are run in parallel

You could search for specific query run using information_schema.query_history_by_session and then refer to resultset using retrieved QUERY_ID.

SHOW PRIMARY KEYS IN DATABASE;

-- find the newest occurence of `SHOW PRIMARY KEYS`:    
SET queryId = (SELECT QUERY_ID
               FROM TABLE(information_schema.query_history_by_session()) 
               WHERE QUERY_TEXT LIKE '%SHOW PRIMARY KEYS IN DATABASE%'
               ORDER BY ENDTIME DESC LIMIT 1);

SELECT * FROM TABLE(RESULT_SCAN($queryId));
Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275