My use case is to accept Azure Synapse DW DDL as input, parse it and extract SQL statements from it. Additionally I can benefit from not only the base statements (viz, DECLARE, CREATE, ALTER, SELECT, ...) but also other DDL constructs like PK constraints, Indexing, Distribution, partitioning etc.
Reaching out to understand if there is a known way to achieve this? Much thanks in advance.
Things I've tried:
TsqlParser (all 8 available versions 80,90,100,110,120,130,140,150) but it fails to parse the DDL and errors out on few tokens which are specific to DW.
Passed a custom TSqlConcreteFragmentVisitor to TSqlFragment but it didn't detect Create table statements either (It could do a visit on SELECT statements though, not shown in attached DDL here)
Used Microsoft.SqlServer.Management.SqlParser.Parser.Scanner to read SQL batches from the DDL but it identified the whole DDL as a single batch. This is not helpful as I am essentially getting the same output as the input I passed in.
Used TSQL.TSQLStatementReader to tokenise the DDL but it gives individual tokens and not SQL statements (For example, in fiddle here, PK constraint NOT ENFORCED is tokenised as 2 tokens and not a single constraint as it should be.
Pls make a note that if I remove DW specific constructs like PK constraint NOT ENFORCED
and WITH (...)
block manually, then TsqlParser successfully parses the DDL and correctly brings out Create statements with tables, columns and datatypes.
Sample DDL:
IF NOT EXISTS (SELECT * FROM sys.schemas WHERE name = N'tempschema_main')
EXEC('CREATE SCHEMA [schema_main]')
CREATE TABLE [schema_main].[temp_table]
(
[TempId1] NVARCHAR(128) NOT NULL,
[TempId2] NVARCHAR(128) NOT NULL,
CONSTRAINT PK_TempPkColumnId PRIMARY KEY NONCLUSTERED (TempId1, TempId2) NOT ENFORCED
)
WITH
(
CLUSTERED COLUMNSTORE INDEX,
DISTRIBUTION = ROUND_ROBIN
)