5

The github sqlite3 grammar github.com/antlr/grammars-v4/blob/master/sqlite/SQLite.g4 has issue with left join.

For this sql

select * from t1 left join t2 on t1.owner = t2.email

the word 'left' is parsed as a table_alias. Things go rapidly downhill from there

I think I can fix it by somehow saying that table_alias is any_name except K_LEFT, K_RIGHT, K_INNER but I do not know how to express that in a grammar

Or maybe there is a better way to fix this

UPDATE. Just to clarify sqlite behavior. (Not talking about antlr , talking about what sqlite understands)

this

select username from user left join device on device.owner = user.id limit 2

works.

This

select username from user alias join device on device.owner = user.id limit 2

fails saying that the column user.id doesn't exist.

Clearly the word 'left' is being recognized as a keyword not a table alias.

select username from user alias join device on device.owner = alias.id limit 2

Works, and does an inner join

select username from user alias left join device on device.owner = user.id limit 2

works and does a left join

Further update for @Mike Lischke

select username from user left where device.owner = left.id limit 2

Fails. "Query has failed: near "where": syntax error".

pm100
  • 48,078
  • 23
  • 82
  • 145
  • Are you writing the grammar from scratch ? Did you take a look at https://github.com/antlr/grammars-v4/blob/master/sqlite/SQLite.g4 ? – YaFred Jul 10 '18 at 15:49
  • that is the grammar i am referring to - updated the question – pm100 Jul 10 '18 at 17:35
  • 1
    Silly question, but does that also happen when you use capitals for all the reserved keywords? – Xedret Jul 10 '18 at 17:45

1 Answers1

0

The behavior is correct. LEFT is included in the keyword rule which in turn is used in any_name, which is valid input for table_alias (this is known as the keywords-as-identifiers problem). Your query is wrong, according to the grammar, period. Any word following the table name is a table alias, regardless whether it is usually a keyword or not.

A workaround could be to add a validating predicate, which checks the following token and fails the table_alias rule if the current token is something that should be used in its original meaning. You can get the same result by removing the LEFT keyword from the keyword rule. However, that might have other side effects in places where you want the LEFT keyword to be valid input for an identifier.

Mike Lischke
  • 48,925
  • 16
  • 119
  • 181
  • well sqlite accepts the query and executes it as though it were a left join, not as a table alias (I cannot refer to the table by the name 'left', adding 'as' before the word 'left' makes it into an alias). I assume that it knows that several keywords are allowed after a tables name: as, left, right, inner and it checks those b4 it triggers the table alais rule – pm100 Jul 11 '18 at 17:46
  • Can you have `LEFT` alone after the table name (i.e. without `JOIN`)?I believe the SQLite parser either changes the meaning of certain keywords depending on some condition (non-context-free parser) or these keywords are not allowed in such contexts like the table alias (and the grammar you use is wrong). – Mike Lischke Jul 11 '18 at 17:55
  • You didn't mention the case I asked for, e.g. `select username from user left where device.owner = left.id limit 2`. If `left` is allowed as normal identifier this will still work, making `left` the alias for `user`. If it fails it does probably for the incomplete `left join` phrase and it means the ANTLR4 grammar is wrong. – Mike Lischke Jul 13 '18 at 06:30
  • See updated question. TL;DR. That sql fails with syntax error – pm100 Jul 13 '18 at 16:18
  • That's what I expected. `LEFT` alone is not allowed (which would if this keyword would really be allowed as identifier there). That means the g4 grammar is wrong. You have to change it. Just take out the few `JOIN` keywords from the `keywords` rule. You may have to create another similar rule which includes them for other rules. – Mike Lischke Jul 14 '18 at 10:42