Following up on a previous question, I'm wondering what the correct way is to deal with the following race condition between whether an expression is a subselect or a parenthesized expression:

grammar Subselect;
statement: query EOF;
query
    : select
    | query 'UNION' query
    ;

select: 'SELECT' expr (',' expr)*;

expr
    : '1'                          # identifier
    | '(' expr ')'                 # parenExpr
    | '(' query ')'                # subSelect
    ;

WHITESPACE: [ \t\r\n] -> skip;

And from running SELECT ((SELECT 1)) I get:

[image: screenshot of the result, not included]

What would be a suggested way to deal with this? (Note: the grammar above is a tremendous simplification, and the select clause has many more components, but this is the simplest example I could use to show the issue, which seems a bit insoluble to me with ANTLR4. Hopefully an expert can help me solve it, though!)

David542
  • This grammar is ambiguous, to be sure. But Antlr4 parses it correctly anyway, no? I mean, you get the expected parse tree. Perhaps the ambiguity causes a slight inefficiency in the parse, but it's hardly a common usage. So, unless it actually causes a problem, the suggested way to deal with this would be "don't worry about it". (Also, it's an ambiguity, not a "race condition". A race condition is when two processes concurrently access the same memory location without locking and at least one modifies the data in that location. Antlr does not parallelise parsing, afaik.) – rici Aug 29 '22 at 00:40
  • FWIW, I think that multiply parenthesised selects are not accepted by all SQL parsers. – rici Aug 29 '22 at 00:52
  • @rici what would be a way to modify the grammar to not accept multiply-parenthesized selects? I think I'd want to take that approach actually. – David542 Aug 29 '22 at 00:58
  • I'd have to see more of your grammar to even have a chance of making a sensible suggestion. Actually, I'm not sure about my last comment; I remember hitting an error like that once, but I don't entirely trust my memory any more :-) – rici Aug 29 '22 at 01:01
  • @rici I see. Yeah, I tried a ton of different ways to add indirection, and the ones I thought would 'work' gave me left-recursion errors in antlr, probably due to the missing `(` token before `expr`. In the previous question I gave a bit more of the grammar, if that gives you enough to work with -- https://stackoverflow.com/questions/73508143/how-to-disambiguate-a-subselect-from-a-parenthesized-expression – David542 Aug 29 '22 at 02:37

1 Answer

Not sure what you really want to achieve (resolving the ambiguity doesn't give you much of an advantage), but to deal with it anyway: why do you use the same definition of an input sequence twice in the first place? You could write:

grammar Subselect;
statement: query EOF;
query
    : select
    ;

select: select_clause;
select_clause: 'SELECT' expr (',' expr)*;

expr
    : '1'                          # identifier
    | '(' expr ')'                 # parenExpr
    | '(' query ')'                # subSelect
    ;

WHITESPACE: [ \t\r\n] -> skip;

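With this grammar, only the `subSelect` alternative puts a pair of parentheses directly around a select statement. Any additional outer pairs, as in `SELECT ((SELECT 1))`, can only be matched by `parenExpr`, so there is a single way to parse the input.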
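
To illustrate why the grammar above leaves no choice to the parser, here is a minimal sketch of a hand-written recursive-descent recognizer for the same rules. It is written in Python rather than ANTLR, and it is not what ANTLR generates; the names `tokenize`, `Parser` and `parse` are hypothetical and chosen only for this example. The point is that after a `(` the next token alone decides the alternative, because a `query` must start with `SELECT` and an `expr` never does.

```python
import re

# Tokens of the toy grammar: 'SELECT', '(', ')', ',' and the literal '1'.
TOKEN_RE = re.compile(r"SELECT|[(),1]")


def tokenize(text):
    """Split the input into tokens, skipping whitespace like the WHITESPACE rule."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos] in " \t\r\n":
            pos += 1
            continue
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group())
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        # "EOF" is a sentinel; the tokenizer can never produce it itself.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else "EOF"

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def statement(self):
        # statement: query EOF
        tree = self.query()
        self.expect("EOF")
        return tree

    def query(self):
        # query: select;  select_clause: 'SELECT' expr (',' expr)*
        self.expect("SELECT")
        items = [self.expr()]
        while self.peek() == ",":
            self.expect(",")
            items.append(self.expr())
        return ("select", items)

    def expr(self):
        if self.peek() == "1":
            self.expect("1")
            return ("identifier", "1")
        self.expect("(")
        # One token after '(' is enough: a query must start with 'SELECT'.
        if self.peek() == "SELECT":
            node = ("subSelect", self.query())
        else:
            node = ("parenExpr", self.expr())
        self.expect(")")
        return node


def parse(text):
    """Parse a whole statement and return a nested-tuple tree."""
    return Parser(text).statement()
```

For the input from the question, `parse("SELECT ((SELECT 1))")` returns `("select", [("parenExpr", ("subSelect", ("select", [("identifier", "1")])))])`: the outer pair is a `parenExpr` and the inner pair is the `subSelect`.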

Mike Lischke
  • I've modified the `query` rule in the question so it's more explicit -- basically that form allows having a set-operation at the top level, such as `SELECT ((SELECT 1 UNION SELECT 1))`. – David542 Aug 29 '22 at 18:52
  • any feedback on the update? – David542 Aug 30 '22 at 17:46
  • Oh that was an additional question? Sorry, I didn't see it as that, but just a comment. The query as it is now is even worse. It allows UNION for all query types, while it is only allowed for SELECT. I have given a possible solution in my answer. If that's not what you want then let me know. – Mike Lischke Aug 31 '22 at 06:54
  • Could you please clarify what you mean by "all query types"? Do you mean something like a `DELETE` or `ALTER`, or something else? But yes, as it is now, our syntax only supports `SELECT`, so it's a bit of a moot point wrt the question. – David542 Aug 31 '22 at 17:28