6

I am going to provide SQL console like data.stackexchange.com for developers on one of the sites I have. The problem is I can not let them execute all type of SQL. So there will be some constrains. Here are some example constrains

  • No query can contain COUNT() function
  • Every query must contain LIMIT with maximum value 100
  • Number of column can not be more than 5
  • Some tables' data will not be accessible due to privacy region.
  • Only SELECT statements will be allowed to be used.

What I am doing right now?

I using regular expression to filter these. Also planning to invoke EXPLAIN or similar query to determine its impact in db prior to executing.

Is there any better way than regex? How it's done on data.SE? Is there anything else I should be worried about?

Update 1

These two rules can be applied by restricting user permission in database server.

  • Some tables' data will not be accessible due to privacy region.
  • Only SELECT statements will be allowed to be used.

So only problem is to validate the SQL

Update 2

I know If I write an SQL parser it'll do. It'll be like writing JS interpreter just remove eval() as its bad practice. So writing an SQL parser from scratch is really not an option. here.

Community
  • 1
  • 1
Shiplu Mokaddim
  • 56,364
  • 17
  • 141
  • 187
  • You need to parse query to tokens, sanitize them and assemble a query. And it's not that easy. "Is there any better way than regex?" -- If you need something to what SE provides - regex isn't suitable at all – zerkms Dec 26 '12 at 03:38
  • 4
    Can't wait to see some of the answers. – Mike Mackintosh Dec 26 '12 at 03:39
  • @zerkms I know that. I am trying to write one. But I am looking for any existing solution. – Shiplu Mokaddim Dec 26 '12 at 03:39
  • If you prepare and describe the statement, you can ensure that it is a SELECT and establish the column count a lot more easily than by parsing. Why 'not COUNT' as opposed to 'no aggregates'? I can circumvent 'not COUNT' by using 'SUM(1)' instead. Aggregates will be detectable with a moderately complex regex if necessary; LIMIT clauses can be detected likewise. – Jonathan Leffler Dec 26 '12 at 06:41
  • \#4 and \#5 should be easy to implement with database server permissions (run the queries as a user that can access only certain tables, and that can only SELECT). – nobody Dec 26 '12 at 03:48

1 Answers1

1

You can create new grammar for yacc, that contains subset of SQL. Than you able to check input SQL with this grammar. SQL lex yacc grammar

Community
  • 1
  • 1
Kostia Shiian
  • 1,024
  • 7
  • 12