1

I'm trying to use the REGEXP_SUBSTR method in Snowflake to extract a string after the period. I have the following possibilities:

  • ALTER TABLE DATABASE_NAME.SCHEMA_NAME.TABLE_NAME ADD COLUMN ....
  • ALTER TABLE SCHEMA_NAME.TABLE_NAME ADD COLUMN ....
  • ALTER TABLE TABLE_NAME ADD COLUMN ....

With the following possibilities, I need to be able to extract TABLE_NAME. So far, I've tried this which is able to get the third possibility:

select regexp_substr(query_text, 'ALTER TABLE ([0-9A-Za-z\\-\\_]+) ADD COLUMN', 1,1, 'e', 1);

I'm not sure how I could capture all three possibilities in one regular expression. Would appreciate any pointers. Thank you.

kiwiLime
  • 115
  • 1
  • 11

2 Answers2

1

Using regexp and split_part:

WITH cte(query_text) AS (
   SELECT 'ALTER TABLE DATABASE_NAME.SCHEMA_NAME.TABLE_NAME ADD COLUMN ....' UNION
   SELECT 'ALTER TABLE SCHEMA_NAME.TABLE_NAME ADD COLUMN ....' UNION
   SELECT 'ALTER TABLE TABLE_NAME ADD COLUMN ....'
)
select 
  regexp_substr(query_text, 'ALTER TABLE (.+) ADD COLUMN', 1,1, 'e', 1) as obj_name,
  split_part(obj_name, '.', -1) AS tab_name,
  split_part(obj_name, '.', -2) AS schema_name,
  split_part(obj_name, '.', -3) AS db_name
FROM cte;

Output:

OBJ_NAME TAB_NAME SCHEMA_NAME DB_NAME
DATABASE_NAME.SCHEMA_NAME.TABLE_NAME TABLE_NAME SCHEMA_NAME DATABASE_NAME
SCHEMA_NAME.TABLE_NAME TABLE_NAME SCHEMA_NAME
TABLE_NAME TABLE_NAME
Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
0
select regexp_substr(query_text, 'ALTER TABLE ([0-9A-Za-z\\-\\_.]+\.)?([0-9A-Za-z\\-\\_]+) ADD COLUMN', 1, 1, 'e', 2);

Will do what you described.

select regexp_substr(query_text, 'ALTER TABLE (["0-9A-Za-z\\-\\_.]+\.)?"?([0-9A-Za-z\\-\\_]+)"? ADD COLUMN', 1, 1, 'e', 2);

Will also extract table name if it is in quotes.

markalex
  • 8,623
  • 2
  • 7
  • 32