
I have to determine the complexity level (simple/medium/complex, etc.) of a SQL statement by counting the number of occurrences of specific keywords, sub-queries, derived tables, functions, etc. that constitute the SQL. Additionally, I have to syntactically validate the SQL.
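
To give an idea of what I mean by counting, here is a rough Python sketch of the approach; the keyword weights and thresholds are made up, and it does not strip string literals or comments first, so keywords inside literals would be miscounted:

    import re

    # Made-up weights and thresholds -- tune for your own notion of complexity.
    WEIGHTED_PATTERNS = {
        r"\bjoin\b": 2,
        r"\bunion\b": 2,
        r"\bcase\s+when\b": 2,
        r"\bgroup\s+by\b": 1,
        r"\border\s+by\b": 1,
        r"\(\s*select\b": 3,  # crude proxy for sub-queries and derived tables
    }

    def complexity(sql: str) -> str:
        # Naive scan: string literals and comments are NOT stripped first,
        # which is exactly the kind of gap that calls for a real parser.
        text = sql.lower()
        score = sum(w * len(re.findall(p, text))
                    for p, w in WEIGHTED_PATTERNS.items())
        if score <= 2:
            return "simple"
        if score <= 6:
            return "medium"
        return "complex"

    print(complexity("SELECT a FROM t1 JOIN t2 ON t1.id = t2.id "
                     "WHERE a IN (SELECT b FROM t3)"))  # -> medium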

I searched online and found that Perl has two modules, SQL::Statement and SQL::Parser, which could be leveraged to achieve this. However, I found that these modules have several limitations (for example, CASE WHEN constructs are not supported).

That being said, would it be better to build a custom, concise SQL parser with Lex/Yacc or Flex/Bison instead? Which approach would be better and quicker?

Please share your thoughts on this. Also, can anyone point me to online resources that discuss the same?

Thanks

Raj Aryan

1 Answer


Teradata has many non-ANSI features, and you are effectively considering re-implementing its parser.

Instead, use the database server itself: put an EXPLAIN in front of your statements and process the result.

explain select * from dbc.dbcinfo;

  1) First, we lock a distinct DBC."pseudo table" for read on a RowHash
     to prevent global deadlock for DBC.DBCInfoTbl.
  2) Next, we lock DBC.DBCInfoTbl in view dbcinfo for read.
  3) We do an all-AMPs RETRIEVE step from DBC.DBCInfoTbl in view
     dbcinfo by way of an all-rows scan with no residual conditions
     into Spool 1 (group_amps), which is built locally on the AMPs.
     The size of Spool 1 is estimated with low confidence to be 432
     rows (2,374,272 bytes).  The estimated time for this step is 0.01
     seconds.
  4) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 0.01 seconds.

This will also validate your SQL: if a statement does not parse, EXPLAIN fails with an error instead of returning a plan.
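
For example, here is a minimal sketch of driving EXPLAIN from a script, using Teradata's Python driver (teradatasql); the host, credentials, and the heuristics applied to the plan text are placeholders you would replace with your own:

    import re
    import teradatasql  # Teradata SQL Driver for Python

    def explain(sql):
        # Placeholder connection details -- substitute your own.
        with teradatasql.connect(host="tdhost", user="dbc", password="dbc") as con:
            with con.cursor() as cur:
                try:
                    cur.execute("EXPLAIN " + sql)
                except teradatasql.OperationalError as err:
                    # The parser rejected the statement: validation for free.
                    print("invalid SQL:", err)
                    return None
                # Each fetched row holds a chunk of the optimizer's explanation.
                return "\n".join(row[0] for row in cur.fetchall())

    plan = explain("select * from dbc.dbcinfo")
    if plan is not None:
        steps = len(re.findall(r"^\s*\d+\)", plan, re.M))  # numbered plan steps
        joins = plan.lower().count("join")                 # crude complexity signal
        print(steps, "steps,", joins, "join mentions")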

KeepCalmAndCarryOn