6

Is there a Snowflake command that will transform a table like this:

a,b,c
1,10,0.1
2,11,0.12
3,12,0.13

to a table like this:

key,value
a,1
a,2
a,3
b,10
b,11
b,13
c,0.1
c,0.12
c,0.13

?

This operation is often called melt in other tabular systems, but the basic idea is to convert the table into a list of key value pairs.

There is an UNPIVOT in SnowSQL, but as I understand it UNPIVOT requires to manually specify every single column. This doesn't seem practical for a large number of columns.

Haterind
  • 1,095
  • 1
  • 8
  • 16
  • 1
    I haven't tried this, but could you leverage something like this: https://github.com/jrideout/melt.js/blob/master/melt.js and wrap that into a Snowflake Javascript UDTF?? Might take some effort, but perhaps that'd work. Otherwise, `UNPIVOT` statements can also be generated dynamically using a stored procedure and executed that way, if needed. I believe `UNPIVOT` is the standard SQL way to do this...not just a Snowflake thing. – Mike Walton Jul 28 '21 at 01:58
  • for large number of columns, use the metadata tables which gives you the column names that you want to unpivot/melt instead of writing out each column name of the table. – George Joseph Jul 28 '21 at 02:10
  • @MikeWalton "Dynamic UNPIVOT" could be performed with SQL only. – Lukasz Szozda Jul 28 '21 at 14:30
  • 1
    @LukaszSzozda good stuff, man! Thanks. I've never looked into this method before. – Mike Walton Jul 29 '21 at 15:16

2 Answers2

9

Snowflake's SQL is powerful enough to perform such operation without help of third-party tools or other extensions.

Data prep:

CREATE OR REPLACE TABLE t(a INT, b INT, c DECIMAL(10,2))
AS
SELECT 1,10,0.1
UNION SELECT 2,11,0.12
UNION SELECT 3,12,0.13;

enter image description here

Query(aka "dynamic" UNPIVOT):

SELECT f.KEY, f.VALUE
FROM (SELECT OBJECT_CONSTRUCT_KEEP_NULL(*) AS j FROM t) AS s
,TABLE(FLATTEN(input => s.j)) f
ORDER BY f.KEY;

Output:

enter image description here


How does it work?

  1. Transform row into JSON(row 1 becomes { "A": 1,"B": 10,"C": 0.1 })
  2. Parse the JSON into key-value pairs using FLATTEN
Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
0

Alternative approach using stored procedure Returning Tabular Data that wraps Python pandas.melt():

CREATE OR REPLACE PROCEDURE melt(tablename TEXT, id_vars ARRAY, value_vars ARRAY)
RETURNS TABLE()
LANGUAGE PYTHON
RUNTIME_VERSION = 3.8
PACKAGES =('snowflake-snowpark-python')
HANDLER = 'main'
AS
$$
import snowflake.snowpark as snowpark

def main(session: snowpark.Session, tablename: str, id_vars: list, value_vars: list): 
    return session.create_dataframe( \
        session.table(tablename).to_pandas().melt(id_vars,value_vars))
$$;

Input data:

CREATE OR REPLACE TABLE T(A INT, B INT, C DECIMAL(10,2))
AS
SELECT 1,10,0.1
UNION SELECT 2,11,0.12
UNION SELECT 3,12,0.13;

SELECT * FROM t;
A B C
1 10 0.1
2 11 0.12
3 12 0.13

Invoking stored procedure:

CALL melt('PUBLIC.T', NULL, NULL);
-- <=>
CALL melt('PUBLIC.T', [], ['A','B','C']);
/*
variable    value
A   1
A   2
A   3
B   10
B   11
B   12
C   0.1
C   0.12
C   0.13
*/

Further transformations on the result of stored procedure are possible:

SELECT * 
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())) 
WHERE "variable" = 'A';
/*
variable    value
A   1
A   2
A   3
*/

Other possibilities:

CALL melt('PUBLIC.T', ['A'], ['B','C']);
/*
A   variable    value
1   B   10
2   B   11
3   B   12
1   C   0.1
2   C   0.12
3   C   0.13
*/


CALL melt('PUBLIC.T', [], ['B']);
/*
variable    value
B   10
B   11
B   12
*/

Extras: unpivot_all:

CREATE OR REPLACE PROCEDURE unpivot_all(tablename TEXT)
RETURNS TABLE()
LANGUAGE PYTHON
RUNTIME_VERSION = 3.8
PACKAGES =('snowflake-snowpark-python')
HANDLER = 'main'
AS
$$
import snowflake.snowpark as snowpark

def main(session: snowpark.Session, tablename: str): 
    df = session.table(tablename)
    return df.unpivot("key", "value", df.columns)
$$;

All columns in the table should have the same data type

CALL unpivot_all('PUBLIC.T');
/*
VALUE   KEY
A   1
B   10
C   0.1
A   2
B   11
C   0.12
A   3
B   12
C   0.13
*/

Related: unpivot(tablename TEXT, cols ARRAY)

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275