0

I have a snowflake file with a query like as below, in the snowflake operator if I have a return so that I can pass xcom to the next task.

How can I get only the last query id to be returned for xcom ? Basically I need to get the snowflake last query id to xcom

In SQL File:

select columns from tableA ;
select last_query_id();

Error : Multiple SQL statements in a single API call are not supported; use one API call per statement instead.

or is there a way I can get below query id returned to xcom enter image description here

Code:

  class LastQueryId(SnowflakeOperator):

    def execute(self, context: Any) -> None:
        """Run query on snowflake"""
        self.log.info('Executing: %s', self.sql)
        hook = SnowflakeHook(snowflake_conn_id=self.snowflake_conn_id,
                             warehouse=self.warehouse, database=self.database,
                             role=self.role, schema=self.schema, authenticator=self.authenticator)
        result = hook.run(self.sql, autocommit=self.autocommit, parameters=self.parameters)
        self.query_ids = hook.query_ids

        if self.do_xcom_push and len(self.query_ids) > 0:
            return self.query_ids[-1]

UPDATED: I was able to get the query id of the snowflake with above code but in the log, I also see the result of the query, how can I avoid those in the log

[2022-06-23, 20:43:39 UTC] {cursor.py:696} INFO - query: [SELECT  modifieddate, documentdate...]
[2022-06-23, 20:43:40 UTC] {cursor.py:720} INFO - query execution done
[2022-06-23, 20:43:56 UTC] {snowflake.py:307} INFO - Statement execution info - {'MODIFIEDDATE': datetime.datetime(2022, 6, 23, 11, 42, 34, 233000), 'DOCUMENTDATE': datetime.datetime(2015, 10, 1, 0, 0)...}
[2022-06-23, 20:43:56 UTC] {snowflake.py:307} INFO - Statement execution info - {'MODIFIEDDATE': datetime.datetime(2022, 6, 23, 13, 50, 45, 377000), 'DOCUMENTDATE': datetime.datetime(2021, 7, 1, 0, 0)...}
[2022-06-23, 20:43:56 UTC] {snowflake.py:307} INFO - Statement execution info - {'MODIFIEDDATE': datetime.datetime(2022, 6, 23, 11, 36, 51, 583000), 'DOCUMENTDATE': datetime.datetime(2015, 8, 31, 0, 0)...}
....
....
....
[2022-06-23, 20:43:56 UTC] {snowflake.py:311} INFO - Rows affected: 22116
[2022-06-23, 20:43:56 UTC] {snowflake.py:312} INFO - Snowflake query id: 01a5259b-0501-98f3-0251-830144baa623
[2022-06-23, 20:43:56 UTC] {connection.py:509} INFO - closed
Kar
  • 790
  • 13
  • 36

1 Answers1

1

SnowflakeOperator already store the query_ids but it does not push them to xcom.

You can create a custom operator as:

class MySnowflakeOperator(SnowflakeOperator):

    def execute(self, context: Any) -> None:
        """Run query on snowflake"""
        self.log.info('Executing: %s', self.sql)
        hook = self.get_db_hook()
        execution_info = hook.run(self.sql, autocommit=self.autocommit, parameters=self.parameters)
        self.query_ids = hook.query_ids

        if self.do_xcom_push and len(self.query_ids) > 0:
            return self.query_ids[-1]  # last query_id

If you want to maintain the original operator functionality then you can do:

class MySnowflakeOperator(SnowflakeOperator):

    def execute(self, context: Any) -> None:
        parent_return_value = super().execute(context)
        if self.do_xcom_push and len(self.query_ids) > 0:
            self.xcom_push(
                context,
                key="last_query_id",
                value=self.query_ids[-1],
            )   
        return parent_return_value
Elad Kalif
  • 14,110
  • 2
  • 17
  • 49
  • Yes i just did it before you answered me , Still I will look into yours and try it .. thanks a lot @Elad Kalif.. Can you please check my original post UPDATED ONE and let me know if it is good – Kar Jun 23 '22 at 19:22
  • I have one issue, why is it printing in the airflow log even the results of the query Please check UPDATED – Kar Jun 23 '22 at 20:52
  • This is a sperated and unrelated issue. Please open a new question for this. – Elad Kalif Jun 24 '22 at 09:32