Questions tagged [antlr4]

Version 4 of ANother Tool for Language Recognition (ANTLR), a flexible lexer/parser generator. ANTLR4 features an enhanced adaptive LL(*) parsing algorithm that improves on the simpler LL(*) algorithm used in ANTLR3.

ANTLR stands for ANother Tool for Language Recognition, a powerful parser generator for reading, processing, executing, or translating structured text or binary files. At its core, ANTLR uses a grammar, with syntax loosely based on Backus-Naur form, to generate a parser. That parser produces easily traversable parse trees, which the user can process further. ANTLR's simple yet powerful design has allowed it to be used in many projects, from the expression evaluator in Apple's Numbers application [1] to JetBrains' IntelliJ IDEA IDE [2].

The main improvement from ANTLR3 to ANTLR4 is a change in the parsing algorithm. This new variation of the LL(*) parsing algorithm, named adaptive LL(*), pushes all of the grammar analysis effort to runtime, making ANTLR able to handle directly left-recursive rules. This new resilience led to the name "Honey Badger", about which Terence Parr had this to say:

ANTLR v4 is called the honey badger release after the fearless hero of the YouTube sensation, "The Crazy Nastyass Honey Badger". To quote the honey badger, ANTLR v4 just doesn't give a damn. It's pretty bad ass. It'll take just about any grammar you give it and parse correctly. And, without backtracking!

-- Terence Parr

(To read more, check out the full conversation!)
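
To make the left-recursion point concrete, here is a minimal sketch of a grammar that ANTLR 4 accepts. The grammar name Expr and the rule and token names are illustrative assumptions, not taken from the official documentation. The expr rule refers to itself as the leftmost symbol of its first two alternatives, which is direct left recursion.

```antlr
// Hypothetical example grammar; all names here are illustrative.
grammar Expr;

// Directly left-recursive rule: 'expr' is the leftmost symbol of its
// own first two alternatives. Alternatives listed earlier bind tighter.
expr
    : expr ('*' | '/') expr   // multiplication and division
    | expr ('+' | '-') expr   // addition and subtraction
    | INT                     // integer literal
    | '(' expr ')'            // parenthesized sub-expression
    ;

INT : [0-9]+ ;                // one or more digits
WS  : [ \t\r\n]+ -> skip ;    // discard whitespace
```

In a rule like this, ANTLR 4 resolves the ambiguity between operators in favor of the alternative listed first, so multiplication and division bind tighter than addition and subtraction. This applies to direct left recursion only; rules that are left-recursive through one another are still reported as an error.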

If you are interested in learning to use ANTLR4, a good place to start would be the official documentation, which provides an excellent introduction to the library itself.

Further Reading:

1 Sourced from a paper written by Terence Parr himself.

2 Sourced from JetBrains' official list of third-party software in IDEA.

3 On January 24th 2013, the www.antlr.org address was changed from pointing at the site for ANTLR version 3 (www.antlr3.org) to the site for ANTLR version 4 (www.antlr4.org). Questions and answers that linked to www.antlr.org before this date therefore referred to ANTLR 3.x. Such links should be updated to www.antlr3.org for ANTLR 3.x or www.antlr4.org for ANTLR 4.x.

3877 questions
1
vote
1 answer

Antlr4 token ambiguity for single character

I have a problem with the rule mnemonic_format. Instead of recognizing a simple text like A100, it gives the following error: mismatched input 'A100' expecting 'A'. The grammar is: grammar SimpleMathGrammar; INTEGER : [0-9]+; FLOAT :…
Massimo
  • 43
  • 7
1
vote
1 answer

Indentation management in ANTLR4 for a python interpreter

I'm implementing a Python interpreter using ANTLR4 as the lexer and parser generator. I used the BNF defined at this link: https://github.com/antlr/grammars-v4/blob/master/python3/Python3.g4. However, the implementation of indentation with the INDENT…
Marco
  • 27
  • 6
1
vote
0 answers

Error when compiling java file in cmd

I am not sure why, but I am getting an error when trying to run my Java file in cmd using javac MyCompiler.java. The file runs fine when I run it from IntelliJ. I have already added C:\Program Files\Java\jdk1.8.0_144\bin to the class path. The error I am…
AaySquare
  • 123
  • 10
1
vote
0 answers

ANTLR4 CPP target visitor bad_cast error

I'm trying to use a custom visitor class for a simple expression grammar. // .h class MyVisitor: public MyParserBaseVisitor {...} // .cpp Any MyVisitor::visitExpr(MyParser::ExprContext *ctx) { auto result = visitChildren(ctx); …
daycoder
  • 21
  • 3
1
vote
1 answer

ANTLR4 no viable alternative at input after adding parser rule

I'm trying to define the language of XQuery and XPath in test.g4. The part of the file relevant to my question looks like: grammar test; ap: 'doc' '(' '"' FILENAME '"' ')' '/' rp | 'doc' '(' '"' FILENAME '"' ')' '//' rp ; rp: ...; f: ...; xq:…
paranoider
  • 27
  • 2
1
vote
1 answer

Mutually left-recursive lexer rules in ANTLR4?

I'm trying to write a syntax highlighter for the Swift language. In addition to tokens, I would also like to highlight some language constructs. I'm having problems with the following rule: Type : '[' Type ']' | '[' Type ':' Type ']' | (Attributes?…
jeudesprit
  • 25
  • 5
1
vote
2 answers

Recognizing a REAL value

I am trying to recognize real values (such as xxx.xx). This grammar does not work: grammar Test; realValue: NUMBER DOT DECIMALS ; DOT: '.' ; NUMBER: '0' | ('1'..'9')('0'..'9')* ; DECIMALS: ('0'..'9')('0'..'9')* ; WS: ('…
YaFred
  • 9,698
  • 3
  • 28
  • 40
1
vote
1 answer

Exclude some characters in Unicode category

I'm trying to implement a rule along the lines of "all characters in the Letter and Symbol Unicode categories except a few reserved characters." From the lexer rules, I know I can use \p{___} to match against Unicode categories, but I am unsure of…
Panda
  • 877
  • 9
  • 21
1
vote
2 answers

ANTLR4: How to get the position in the source with python3

I would like to use ANTLR4 to analyze COBOL files using a Python3 program. To do so, I would need to know the position at which a token (let's say a MOVE statement) occurs in the file (at least the line and, if possible, also the character position).…
jottbe
  • 4,228
  • 1
  • 15
  • 31
1
vote
2 answers

How to implement this rule in ANTLR4?

How to implement this rule in ANTLR4: multiline-comment-text-item -> Any Unicode scalar value except /* or */ ?
jeudesprit
  • 25
  • 5
1
vote
1 answer

Underscores seen as white space. Is this normal?

In my grammar, I have this for white space: WS: (' '|'\r'|'\t'|'\n') -> skip ; However, the parser does not choke if I put an underscore instead of a space. My-first-module_DEFINITIONS_::= is recognized as My-first-module DEFINITIONS ::= Is…
YaFred
  • 9,698
  • 3
  • 28
  • 40
1
vote
1 answer

ANTLR Visitor of a rule with alternatives

I have a grammar rule like this: fctDecl : id AS FUNCTION O_PAR (argDecl (COMMA argDecl)*)? C_PAR COLON (scalar|VOID) (DECLARE LOCAL (varDecl SEMICOLON)+)? DO (instruction)+ (RETURN id)? DONE; When I'm in visitFctDecl, I have to visit…
Yvkevitch
  • 65
  • 8
1
vote
1 answer

/s/S in ANTLR parser rules

I want to write a parser rule to parse a valid string. My rule goes like this: STRING: '"' [\s\S]+ '"'; But it gives me a warning saying invalid escape sequence \s. I tried other escape sequences like \t, \n... they are all fine. Can anyone tell me…
paranoider
  • 27
  • 2
1
vote
0 answers

ANTLR4 grammar for postfix integer arithmetic

I'm very new to ANTLR4 and am trying to make an expression analyser for the target language Forth. As Forth uses postfix notation, I am trying to write grammar rules for postfix integer arithmetic. Below is the grammar for both infix and postfix integer…
1
vote
2 answers

How to get context / line number in ANTLR 4 parser rule?

Take this rule / catch for example: section : (title sstart row+ send); catch[Exception e] {System.out.println("Notification: Problem on line " + *line # here*); System.exit(0);} How could I get the line number of the token that threw the…
Tristan
  • 1,608
  • 1
  • 20
  • 34