Questions tagged [antlr4]

Version 4 of ANother Tool for Language Recognition (ANTLR), a flexible lexer/parser generator. ANTLR4 features an enhanced adaptive LL(*) parsing algorithm that improves on the simpler LL(*) algorithm used in ANTLR3.

ANTLR stands for ANother Tool for Language Recognition, a powerful parser generator for reading, processing, executing, or translating structured text or binary files. At its core, ANTLR uses a grammar, with syntax loosely based on Backus–Naur form, to generate a parser. That parser produces easily traversable parse trees, which can be processed further by the user. ANTLR's simple yet powerful design has allowed it to be used in many projects, from the expression evaluator in Apple's Numbers application [1] to JetBrains' IntelliJ IDEA IDE [2].
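As a rough illustration of what such a grammar looks like, here is a minimal sketch of a combined lexer/parser grammar. The grammar name `Hello`, the rule `greeting`, and the token names are made up for this example and do not come from any project mentioned here.

```antlr
// Hello.g4 (hypothetical example; the file name must match the grammar name)
grammar Hello;

// Parser rules start with a lowercase letter.
greeting : 'hello' ID EOF ;

// Lexer rules start with an uppercase letter.
ID : [a-zA-Z]+ ;
WS : [ \t\r\n]+ -> skip ;   // discard whitespace between tokens
```

Running the ANTLR tool on this file (for example `java org.antlr.v4.Tool Hello.g4`, assuming the ANTLR 4 jar is on the classpath) should generate a lexer and a parser, by default in Java, whose `greeting()` method returns the parse tree for input such as `hello world`.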

The main improvement of ANTLR4 over ANTLR3 is a change in the parsing algorithm. This new variation of the LL(*) parsing algorithm, called adaptive LL(*), pushes all of the grammar analysis effort to runtime; ANTLR4 can also handle directly left-recursive rules. This resilience led to the nickname "Honey Badger", about which Terence Parr had this to say:

ANTLR v4 is called the honey badger release after the fearless hero of the YouTube sensation, "The Crazy Nastyass Honey Badger". To quote the honey badger, ANTLR v4 just doesn't give a damn. It's pretty bad ass. It'll take just about any grammar you give it and parse correctly. And, without backtracking!

-- Terence Parr

(To read more, check out the full conversation!)
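To make the left-recursion support mentioned above concrete, here is a minimal sketch of an expression grammar that ANTLR 4 accepts even though `expr` refers to itself as its own leftmost element. The grammar and rule names are illustrative assumptions, not taken from the ANTLR documentation.

```antlr
// Expr.g4 (hypothetical example)
grammar Expr;

prog : expr EOF ;

// Directly left-recursive rule. Alternatives listed earlier bind tighter,
// so '*' and '/' take precedence over '+' and '-'.
expr : expr ('*' | '/') expr
     | expr ('+' | '-') expr
     | INT
     | '(' expr ')'
     ;

INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;   // discard whitespace between tokens
```

Only direct left recursion is handled this way. If the recursion went through a second rule (say `expr` starting with a hypothetical `term` that itself starts with `expr`), the tool would report the rules as mutually left-recursive and the grammar would have to be restructured by hand.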

If you are interested in learning to use ANTLR4, a good place to start would be the official documentation, which provides an excellent introduction to the library itself.

Further Reading:

[1] Sourced from a paper written by Terence Parr himself.

[2] Sourced from JetBrains' official list of third-party software in IDEA.

[3] On January 24th, 2013, the www.antlr.org address was changed from pointing at the site for ANTLR version 3 (www.antlr3.org) to the site for ANTLR version 4 (www.antlr4.org). Questions and answers that used www.antlr.org before this date were therefore referring to ANTLR 3.x. Such links should be updated to www.antlr3.org for ANTLR 3.x or www.antlr4.org for ANTLR 4.x.

3877 questions
1 vote, 1 answer

ANTLR4: rule 'RULE' contains a closure with at least one alternative that can match an empty string

I am writing a file parser with ANTLR4. The file can have a number of blocks, which all begin and end with a (BEGIN | END) keyword. Here is a very simple example: grammar test; BEGIN: 'BEGIN'; END: 'END'; HEADER:'HEADER'; BODY: 'BODY'; file:…
Paul Würtz • 1,641 • 3 • 22 • 35
1 vote, 1 answer

Want to extract the table names and column names from SQL Statement

I am new to ANTLR. I got the SQLite grammar from GitHub and I am able to generate the lexer and parser for it in Java. I am trying to parse an SQL statement and get the table names and column names out, but I am getting the total Statement…
1 vote, 1 answer

Removal of indirect left recursion (I don't understand formal symbols)

I've tried looking for answers to my problem, but I can't seem to wrap my head around the generalized solutions. It doesn't help that I can't figure out which of my elements map to capital letters and which are supposed to be represented by small…
markonius • 625 • 6 • 25
1 vote, 1 answer

Token with different interpretations (i.e. keyword and identifier)

I am writing a grammar with a lot of case-insensitive keywords in ANTLR4. I collected some example files for the format to test-parse, and some of them use tokens that are keywords as identifiers in other places. For example there…
Paul Würtz • 1,641 • 3 • 22 • 35
1 vote, 1 answer

The parser didn't consume all tokens. Is it a bug?

Env: ANTLR 4.7.1. The grammar is: grammar Whilelang; program : seqStatement; seqStatement: statement (';' statement)* ; statement: ID ':=' expression # attrib | 'print' Text # print |…
1 vote, 2 answers

Antlr - mismatched input error - token not recognised

I have the following ANTLR grammar. grammar DDGrammar; ddstmt: dd2 EOF; dd2: splddstart inlinerec; splddstart: '//' NAME DDWORD '*' NL; inlinerec: NON_JCL_CARD* END_OF_FILE ; DDWORD:'DD'; //DUMMYWORD: 'DUMMY'; NAME: [A-Z@#$]+; NON_JCL_CARD :…
ssdimmanuel • 448 • 10 • 29
1 vote, 1 answer

ANTLR stop at first occurrence

I have these rules: while: 'while' expr 'do' program; if: 'if' expr 'then' program 'else' program; I don't care what expr contains, so how can I take everything there until then or do? I tried: expr: .*?~('then'|'do'); but it is not working. Why?
ddd • 13 • 5
1 vote, 1 answer

Antlr4 Testrig doesn't return anything

This is the first time I have used ANTLR4 and I have a question regarding the test rig. I've finished the installation process and tried out the sample given on both the ANTLR4 main site and the GitHub page. Here's what I've…
1 vote, 1 answer

ANTLR4 - how to interrupt

Suppose a line has a maximum length of 5. I want an Identifier to continue when a newline character is put on position 5. examples: abcd'\n'ef would result in a single Identifier "abdef" ab'\n'def would result in Identifier "ab" (and another one…
1 vote, 1 answer

What is invokingState in RuleContext class implementations?

I see the Rule Indexes in JavaParser.java, but there is another integer value i.e. invoking state. Is this invoking state related to the getStartToken or how is it different from rule indexes?
1 vote, 1 answer

Antlr 4 get (print) all parse trees when there is ambiguity

Consider the following ANTLR 4 grammar: grammar Test; start: e EOF; e : e '+' e #op | NUMBER #atom ; NUMBER: [0-9]+; Based on the disambiguation rules of ANTLR, in this case binary operators being left associative, the result of…
Wickoo • 6,745 • 5 • 32 • 45
1 vote, 1 answer

antlr - specify a parser rule with any sequence

I have a section of an ANTLR grammar which goes like this: mainfilter: mandatoryfilter (optionalfilter1)? (optionalfilter2)? (optionalfilter3)? ; mandatoryfilter: 'NAME' '=' ID; optionalfilter1: 'VALUE1' EQ ID; optionalfilter2: 'VALUE2' EQ ID;…
ssdimmanuel • 448 • 10 • 29
1 vote, 1 answer

How to implement subtraction expression in my ANTLR4 Java extended Listener class?

I have a task to convert my ANTLR4 Java project, which uses a visitor, into the same project using a listener. I am having trouble understanding how the listener works. My visitor subtract method looked like this: // expression '-'…
1 vote, 1 answer

Write to the order not be optional in ANTLR 4

I'm writing an ANTLR grammar for my query syntax, so the scripts below should be valid and pass: select * from Person select name,age from Person select _id,name,age from Person select name,age,adress.age from Person select name , age from…
Otávio Santana • 318 • 2 • 12
1 vote, 1 answer

Running TestRig with wrong input string does not emit error message

I made a grammar which constructs comparison between expressions as follows. grammar EtlExpression; /** The start rule; begin parsing here. */ prog : comp ; comp : expr ('='|'<='|'>='|'<'|'>') expr ; expr : expr ('*'|'/') expr |…
류상욱 • 11 • 1