As a complete beginner in ANTLR4, I haven't been able to make use of the answer to a similar question. As far as I can tell, the fragments in my grammar are referenced only by terminal (lexer) rules, yet I still get the following errors when I feed it the string "myIdentifier":
line 1:0 token recognition error at: 'm'
line 1:1 token recognition error at: 'y'
line 1:2 token recognition error at: 'I'
line 1:3 token recognition error at: 'd'
line 1:4 token recognition error at: 'e'
line 1:5 token recognition error at: 'n'
line 1:6 token recognition error at: 't'
line 1:7 token recognition error at: 'i'
line 1:8 token recognition error at: 'f'
line 1:9 token recognition error at: 'i'
line 1:10 token recognition error at: 'e'
line 1:11 token recognition error at: 'r'
Here is my grammar:
grammar Sable;
options {
}
@header {
package org.sable.parser.gen;
}
IDENTIFIER:
(IdentifierHead IdentifierCharacter*)
| ('`'(IdentifierHead IdentifierCharacter*)'`')
;
WS : [ \u0020\u000C\u000A\u000D\u0009u000B\u000C]+ -> skip
;
COMMENT
: '/*' .*? '*/' -> channel(HIDDEN)
;
LINE_COMMENT
: '//' ~[\u000A\u000D]* -> channel(HIDDEN)
;
// NOTE: a file with zero statements is allowed because
// it can contain just comments.
sourceFile:
statement* EOF;
statement:
expression ';'?;
// Req. not existing any valid expression starting from
// an equals sign or any other assignment operator.
expression:
valuedExpression (assignmentOperator valuedExpression)?;
valuedExpression:
IDENTIFIER
;
assignmentOperator:
'='
| '*='
| '/='
| '%='
| '+='
| '-='
| '<<='
| '>>='
| '&='
| '^='
| '|='
;
fragment DecimalDigit:
'0'..'9'
;
fragment IdentifierHead:
'a'..'z'
| 'A'..'Z'
| '_'
| '\u00A8'
| '\u00AA'
| '\u00AD'
| '\u00AF' |
'\u00B2'..'\u00B5' |
'\u00B7'..'\u00BA' |
'\u00BC'..'\u00BE' |
'\u00C0'..'\u00D6' |
'\u00D8'..'\u00F6' |
'\u00F8'..'\u00FF' |
'\u0100'..'\u02FF' |
'\u0370'..'\u167F' |
'\u1681'..'\u180D' |
'\u180F'..'\u1DBF' |
'\u1E00'..'\u1FFF' |
'\u200B'..'\u200D' |
'\u202A'..'\u202E' |
'\u203F'..'\u2040' |
'\u2054' |
'\u2060'..'\u206F' |
'\u2070'..'\u20CF' |
'\u2100'..'\u218F' |
'\u2460'..'\u24FF' |
'\u2776'..'\u2793' |
'\u2C00'..'\u2DFF' |
'\u2E80'..'\u2FFF' |
'\u3004'..'\u3007' |
'\u3021'..'\u302F' |
'\u3031'..'\u303F' |
'\u3040'..'\uD7FF' |
'\uF900'..'\uFD3D' |
'\uFD40'..'\uFDCF' |
'\uFDF0'..'\uFE1F' |
'\uFE30'..'\uFE44' |
'\uFE47'..'\uFFFD'
;
fragment IdentifierCharacter:
DecimalDigit
| '\u0300'..'\u036F'
| '\u1DC0'..'\u1DFF'
| '\u20D0'..'\u20FF'
| '\uFE20'..'\uFE2F'
| IdentifierHead
;
What am I doing wrong? My assumptions are:
- IDENTIFIER is a terminal
- IdentifierHead and IdentifierCharacter are fragments
- All the remaining rules (sourceFile, statement, expression, valuedExpression, assignmentOperator) are parser rules.
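To make these assumptions concrete, here is a minimal sketch of how I understand the three kinds of rules to fit together. It is not taken from my real grammar: the names Minimal, start, ID, Letter and Digit are hypothetical and exist only for illustration.

```antlr
// Hypothetical stripped-down grammar, for illustration only.
grammar Minimal;

// Parser rule: the name starts with a lowercase letter; it matches tokens.
start : ID EOF ;

// Lexer (terminal) rule: the name starts with an uppercase letter
// and has no 'fragment' prefix, so it produces a token.
ID : Letter (Letter | Digit)* ;

// Fragments: helper rules usable only inside other lexer rules;
// they never produce tokens on their own.
fragment Letter : [a-zA-Z_] ;
fragment Digit  : [0-9] ;
```

If this understanding is right, I would expect "myIdentifier" to come out as a single ID token here, which is the behavior I am trying to get from IDENTIFIER in my grammar.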