Questions tagged [antlr4]

Version 4 of ANother Tool for Language Recognition (ANTLR), a flexible lexer/parser generator. ANTLR4 features an enhanced adaptive LL(*) parsing algorithm that improves on the simpler LL(*) algorithm used in ANTLR3.

ANTLR stands for ANother Tool for Language Recognition, a powerful parser generator for reading, processing, executing, or translating structured text or binary files. At its core, ANTLR uses a grammar, with syntax loosely based on Backus–Naur form, to generate a parser. That parser produces easily traversable parse trees, which can be processed further by the user. ANTLR's simple yet powerful design has allowed it to be used in many projects, from the expression evaluator in Apple's Numbers application¹ to JetBrains' IntelliJ IDEA IDE².

The main improvement from ANTLR3 to ANTLR4 is a change in the parsing algorithm. This new variation of the LL(*) parsing algorithm, called adaptive LL(*), pushes all of the grammar analysis effort to runtime, making ANTLR able to handle directly left-recursive rules. This new resilience led to the nickname "Honey Badger", about which Terence Parr had this to say:

ANTLR v4 is called the honey badger release after the fearless hero of the YouTube sensation, "The Crazy Nastyass Honey Badger". To quote the honey badger, ANTLR v4 just doesn't give a damn. It's pretty bad ass. It'll take just about any grammar you give it at parse correctly. And, without backtracking!

-- Terence Parr

(To read more, check out the full conversation!)
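To give a feel for what handling a directly left-recursive rule involves, here is a minimal sketch in plain Python that does not use ANTLR at all. It assumes a hypothetical rule `expr: expr AND expr | expr OR expr | NAME` and parses it with a precedence-guarded loop instead of recursing on the left. The function names, the `PRECEDENCE` table, and the tuple-based tree are illustrative assumptions, not ANTLR's generated code or API; ANTLR 4's actual rule rewriting is similar in spirit but differs in detail.

```python
import re

# Binding strength of each binary operator; a higher number binds tighter.
# Assumption for this sketch: AND binds tighter than OR, as it would if the
# AND alternative were listed first in the hypothetical grammar rule.
PRECEDENCE = {"OR": 1, "AND": 2}


def tokenize(text):
    """Split the input into word tokens (names and the AND/OR keywords)."""
    return re.findall(r"[A-Za-z_]\w*", text)


def parse_expr(tokens, pos=0, min_prec=1):
    """Parse an expression starting at tokens[pos]; return (tree, next_pos)."""
    if pos >= len(tokens) or tokens[pos] in PRECEDENCE:
        raise SyntaxError(f"expected NAME at token {pos}")
    left = tokens[pos]  # the non-recursive alternative: NAME
    pos += 1
    # The loop replaces the left-recursive call: keep extending `left`
    # while the next operator is strong enough for this nesting level.
    while (pos < len(tokens)
           and tokens[pos] in PRECEDENCE
           and PRECEDENCE[tokens[pos]] >= min_prec):
        op = tokens[pos]
        # Requiring a strictly higher precedence on the right-hand side
        # makes both operators left-associative.
        right, pos = parse_expr(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, pos


def parse(text):
    """Parse a whole string and reject trailing tokens."""
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

For example, `parse("a AND b OR c")` groups the `AND` first, and `parse("a OR b OR c")` nests to the left. Rules that are left-recursive only through other rules (as in the "mutually left-recursive" question listed below) are a different case and are still rejected by ANTLR 4.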

If you are interested in learning to use ANTLR4, a good place to start would be the official documentation, which provides an excellent introduction to the library itself.

Further Reading:

1 Sourced from a paper written by Terence Parr himself.

2 Sourced from JetBrains' official list of third-party software in IDEA.

3 On January 24, 2013, the www.antlr.org address was changed from pointing at the site for ANTLR version 3 (www.antlr3.org) to the site for ANTLR version 4 (www.antlr4.org). Questions and answers that used www.antlr.org were therefore correct for ANTLR 3.x before this date. Such links should be updated to www.antlr3.org for ANTLR 3.x or www.antlr4.org for ANTLR 4.x.

3877 questions
7
votes
2 answers

Is there a parser equivalent of 'fragment' marking in ANTLR4?

Is there a way to tell ANTLR4 to inline a parser rule? It seems reasonable to have such a feature. After reading the book on ANTLR ("The Definitive ANTLR 4 Reference") I haven't found such a possibility, but changes might've been introduced in the 4…
ABW
  • 151
  • 4
7
votes
1 answer

ANTLR4 + Python parsing from string instead of path

I am using ANTLR4 with Python and I am currently using the following code for parsing: lexer = MyGrammarLexer(FileStream(path)) stream = CommonTokenStream(lexer) parser = MyGrammarParser(stream) return parser.start().object However, I would like to…
ec-m
  • 779
  • 1
  • 5
  • 15
7
votes
3 answers

Antlr4: The following sets of rules are mutually left-recursive

I am trying to describe a simple grammar with AND and OR, but it fails with the following error: The following sets of rules are mutually left-recursive. The grammar is as follows: expr: NAME | and | or; and: expr AND expr; or: expr OR…
Dims
  • 47,675
  • 117
  • 331
  • 600
7
votes
2 answers

Syntactic predicates in ANTLR lexer rules

Introduction Looking at the documentation, ANTLR 2 used to have something called predicated lexing, with examples like this one (inspired by Pascal): RANGE_OR_INT : ( INT ".." ) => INT { $setType(INT); } | ( INT '.' ) => REAL {…
MvG
  • 57,380
  • 22
  • 148
  • 276
7
votes
1 answer

Unindented code breaks my grammar

I have a .g4 grammar for a vba/vb6 lexer/parser, where the lexer is skipping line continuation tokens - not skipping them breaks the parser and isn't an option. Here's the lexer rule in question: LINE_CONTINUATION : ' ' '_' '\r'? '\n' -> skip; The…
Mathieu Guindon
  • 69,817
  • 8
  • 107
  • 235
7
votes
1 answer

How can I determine which alternative node was chosen in ANTLR

Suppose I have the following: variableDeclaration: Identifier COLON Type SEMICOLON; Type: T_INTEGER | T_CHAR | T_STRING | T_DOUBLE | T_BOOLEAN; where those T_ names are just defined as "integer", "char" etc. Now suppose I'm in the…
David
  • 5,991
  • 5
  • 33
  • 39
7
votes
1 answer

Python 2.7 & ANTLR4 : Make ANTLR throw exceptions on invalid input

I want to catch errors like line 1:1 extraneous input '\r\n' expecting {':', '/',} line 1:1 mismatched input 'Vaasje' expecting 'Tafel' I tried wrapping my functions in try-catch but, as expected, these errors are just print statements and not…
Emiel Steerneman
  • 382
  • 1
  • 5
  • 12
7
votes
2 answers

Antlr4 C++ target

We're starting a project where we will need to parse Python source files in a C++ application. I've used Antlr2 a while back to generate a few compilers, but this is the first time I'm using Antlr4. It looks like the C++ Antlr4 target is fairly…
Andy Somogyi
  • 81
  • 1
  • 4
7
votes
1 answer

Antlr4 - Implicit Definitions

I am trying to create a simple arithmetic expression parser that, for now, handles only integers. So far I have: grammar MyExpr; input: (expr NEWLINE)+; expr: '(' expr ')' | '-' expr | expr '^' expr | expr ('*' | '/') expr | expr…
NotMe NotYou
  • 111
  • 3
7
votes
3 answers

Use Visitor or Listener with ANTLR4 when returning objects of different types

I translate one language into another with ANTLR4. For example, when I read numerical literals I can return an Integer or a Double. @Override public Integer visitIntegerValue(Parser.IntegerValueContext ctx) { return…
vasily
  • 2,850
  • 1
  • 24
  • 40
7
votes
1 answer

How can I skip a parsing rule using ANTLR 4?

In the lexer, tokens can be skipped, keeping them out of the parser, like so: Whitespace : [ \t\r\n]+ -> skip ; Is there an equivalent to -> skip for the parser? That is, once a parser rule is matched, is there a way to keep it out of the parse…
james.garriss
  • 12,959
  • 7
  • 83
  • 96
7
votes
6 answers

Visitor methods for Java grammar not working in ANTLR 4.4

I am new to the ANTLR framework and have been working on this for a week. I am now in a situation where I need to parse a Java file and extract data from it. I am using ANTLR 4 for parsing. I create the Lexer, Parser and Visitor files using ANTLR in built…
Narayana
  • 363
  • 3
  • 14
7
votes
2 answers

antlr 4 - warning: rule contains an optional block with at least one alternative that can match an empty string

I am using ANTLR v4 to write a T-SQL parser. Is this warning a problem? "rule 'sqlCommit' contains an optional block with at least one alternative that can match an empty string" My code: sqlCommit: COMMIT (TRAN | TRANSACTION | WORK)?…
phil
  • 1,289
  • 1
  • 15
  • 24
7
votes
0 answers

"Resource is currently edited in another editor" after saving ANTLR grammar file

I am using antlr4ide plugin with Eclipse Luna. Every time I save the ANTLR4 grammar file Eclipse shows a dialog box "The resource is currently edited in another editor. Do you want to continue?" Any hint on how to fix it?
7
votes
5 answers

Antlr generated classes access modifier to internal

I am building a library which contains certain parsers. These parsers are internally built with ANTLR4. Since the generated classes are all public, users of my library are able to see all the classes they do not need to see. Also the Sandcastle…
metacircle
  • 2,438
  • 4
  • 25
  • 39