Questions tagged [antlr4]

Version 4 of ANother Tool for Language Recognition (ANTLR), a flexible lexer/parser generator. ANTLR4 features an adaptive LL(*) parsing algorithm that improves on the simpler LL(*) algorithm used in ANTLR3.

ANTLR stands for ANother Tool for Language Recognition, a powerful parser generator for reading, processing, executing, or translating structured text or binary files. At its core, ANTLR takes a grammar, written in a syntax loosely based on Backus-Naur Form, and generates a parser from it. That parser produces easily traversable parse trees, which the user can process further. ANTLR's simple yet powerful design has led to its use in many projects, from the expression evaluator in Apple's Numbers application [1] to JetBrains' IntelliJ IDEA IDE [2].

The main improvement from ANTLR3 to ANTLR4 is a change in the parsing algorithm. This new variation of the LL(*) parsing algorithm, coined adaptive LL(*), pushes all of the grammar analysis effort to runtime, making ANTLR able to handle directly left-recursive rules. This new resilience led to the name "Honey Badger", on which Terence Parr had this to say:

ANTLR v4 is called the honey badger release after the fearless hero of the YouTube sensation, "The Crazy Nastyass Honey Badger". To quote the honey badger, ANTLR v4 just doesn't give a damn. It's pretty bad ass. It'll take just about any grammar you give it and parse correctly. And, without backtracking!

-- Terence Parr

(To read more, check out the full conversation!)
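
As a rough illustration of the left-recursion support mentioned above, the grammar below is a minimal sketch. The grammar name `Expr` and the rule and token names are chosen for this example only. The `expr` rule refers to itself as the leftmost symbol of its first two alternatives, which ANTLR 3 would reject but ANTLR 4 accepts. Alternatives listed earlier take precedence, so multiplication and division bind more tightly than addition and subtraction here.

```antlr
grammar Expr;  // hypothetical grammar name; the file would be Expr.g4

// Directly left-recursive rule: 'expr' is the leftmost symbol
// of its own first two alternatives.
expr
    : expr ('*' | '/') expr   // listed first, so higher precedence
    | expr ('+' | '-') expr   // listed second, so lower precedence
    | '(' expr ')'
    | INT
    ;

INT : [0-9]+ ;                // integer literals
WS  : [ \t\r\n]+ -> skip ;    // discard whitespace
```

Running the ANTLR tool on such a file generates lexer and parser classes for the chosen target language.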

If you are interested in learning to use ANTLR4, a good place to start would be the official documentation, which provides an excellent introduction to the library itself.

Further Reading:

[1] Sourced from a paper written by Terence Parr himself.

[2] Sourced from JetBrains' official list of third-party software in IDEA.

[3] On January 24th, 2013, the www.antlr.org address was changed from pointing at the site for ANTLR version 3 (www.antlr3.org) to the site for ANTLR version 4 (www.antlr4.org). Questions and answers that linked to www.antlr.org before this date were therefore referring to ANTLR 3.x. Such links should be updated to www.antlr3.org for ANTLR 3.x or www.antlr4.org for ANTLR 4.x.

3877 questions
1
vote
1 answer

How do efficiently differentiate between different tokens in a rule in ANTLR4?

I have a simple grammar rule: expr : expr (EQUALS | NOT_EQUALS) expr | literal; literal : ...; // omitted here The lexer recognizes EQUALS and NOT_EQUALS: EQUALS : '='; NOT_EQUALS : '!='; In my code, I want to differentiate between…
1
vote
3 answers

Antlr4 use TokenStreamRewriter in ParseTreeWalker

Goal : To replace modify text, which is a rule I defined in .g4 file, that will enter and exit in my listener class in input String My code : def textModify(input: String) = { val loadLexer = new DSLSQLLexer(new ANTLRInputStream(input)) …
AI Joes
  • 69
  • 11
1
vote
1 answer

How an ANTLR visitor or listener can be written with async/await on its methods?

I am creating a grammar to compile the parser as a JavaScript parser. Then I would like to use async/await to call asynchronous functions within the visitor or listener. As the default generated code does not include async in the functions, await is…
jordiburgos
  • 5,964
  • 4
  • 46
  • 80
1
vote
1 answer

Does the JVM get used in an ANTLR4 c++ program at runtime?

So, the antlr4 C++ god's (Mike Lischke's) website states that everything in the parser was translated to C++. As such, what exactly does the jar do in the c++ implementation? More importantly, does my resulting program require the JVM after…
MikhailS
  • 25
  • 5
1
vote
1 answer

antlr4: how to make identifier case insensitive

deficiency is a keyword in my DSL, I want to make keywords case insensitive. I have read this doc and try. https://github.com/antlr/antlr4/blob/master/doc/case-insensitive-lexing.md In my grammer, I have two basic rules: matching_rule_not_work and…
hellojinjie
  • 1,868
  • 3
  • 17
  • 23
1
vote
1 answer

Switching streams in Lexer for ANTLR4

I am trying to implement an include feature in the lexer so that when it hits '#include "filename"' it will switch to a stream of that file. I got it working using a lexer action shown below. When I run it it seg faults. antlr4::ANTLRInputStream…
MikhailS
  • 25
  • 5
1
vote
1 answer

ANTLR4 lexer rules not matching correct block of text

I am trying to understand how ANTLR4 works based on lexer and parser rules but I am missing something in the following example: I am trying to parse a file and match all mathematic additions (eg 1+2+3 etc.). My file contains the following…
bettas
  • 195
  • 1
  • 2
  • 11
1
vote
0 answers

Python ANTLR4 extraneous input plus tokens removal

I am trying to parse a text file and I want to create a grammar to catch specific text blocks let's say a) the word 'specificWordA' or 'specWordB' followed by zero or more digits, or b) the word 'testC' followed by 1 or more digits. My grammar…
bettas
  • 195
  • 1
  • 2
  • 11
1
vote
3 answers

Antlr4: How to parse only one part of the file

Is it possible to parse only let's say the first half of the file with antlr4? I am parsing large files and I am using UnbufferedCharStream and UnbufferedTokenStream. I am not building a parse tree and I am using parse actions instead of…
1
vote
1 answer

ANTLR4 input grammar

How can I write this grammar expression for ANTLR4 input? Originally expression: = 0|(1 -9){0 -9} = ’( ESC |~( ’|\| LF | CR )) ’ = "{ ESC |~("|\| LF | CR )}" I tried the following…
1
vote
1 answer

ANTLR 4 / Parser recognizes erroneous expression as valid

Grammar file Expr.g4: grammar Expr; expr: expr ('*'|'/'|'+'|'-'|'%') expr | '(' expr ')' | INT ; INT : [0-9]+ ; WS : [ \t\n]+ -> skip ; I use the current ANTLR-Version 4.7.1: In ./bashrc: alias antlr4='java -jar…
D.De
  • 13
  • 2
1
vote
1 answer

How to handle Optional Grammar Blocks with a ANTLR-Visitor?

It is possible that this question has been asked before but i cannot find it. So if you guys find something similar, please let me know. According to the following Rule: fix_body : ident binders (annotation)? (':' term)? ':=' fix_body_term; I…
Tilman Zuckmantel
  • 643
  • 1
  • 6
  • 18
1
vote
1 answer

Writing custom parser using ANTLR4, C# and VS2017

I am trying to parse files that have a format as below. What I'd like to be able to do is create a few vars and an array of structs to contain information about the file. For example there could be (pseudocode) int atomNumber = 27 and then string…
Chylomicron
  • 251
  • 1
  • 3
  • 10
1
vote
1 answer

Antlr grammar not matching expected lexer rule

I'm trying to match a duration string, like for 30 minutes or for 2 hours using the following rules: durationPhrase: FOR_STR (MINUTE_DURATION | HOUR_DURATION); MINUTE_DURATION: NONZERO_NUMBER MINUTE_STR; HOUR_DURATION: NONZERO_NUMBER…
Craig Otis
  • 31,257
  • 32
  • 136
  • 234
1
vote
2 answers

Antlr lexer matching unintended rule

I'm re-learning some basic Antlr and trying to write a grammar to generate todo items: Meeting at 12pm for 20 minutes The issue I'm having is that three lexer rules in particular are getting "mismatched" depending on the context in which they're…
Craig Otis
  • 31,257
  • 32
  • 136
  • 234