I'm trying to use ANTLR4 with C# to design a language, and so far I've just been experimenting with it. As a first step, I decided to build a simple mathematical expression evaluator and wrote the following ANTLR grammar for it:

grammar Calculator;

@parser::members
{
    protected const int EOF = Eof;
}

@lexer::members
{
    protected const int EOF = Eof;
    protected const int HIDDEN = Hidden;
}

program : expr+ ;

expr : expr op=('*' | '/') expr
     | expr op=('+' | '-') expr
     | INT
     | '(' expr ')'
     ;

INT : [0-9]+ ;
MUL : '*' ;
DIV : '/' ;
ADD : '+' ;
SUB : '-' ;
WS : (' ' | '\r' | '\n') -> channel(HIDDEN) ;

When I try to generate C# code from it using this command:

java -jar C:\...\antlr-4.2-complete.jar -Dlanguage=CSharp .\...\Grammar.g4

I get these odd errors:

error(50): C:\Users\Ethan\Documents\Visual Studio 2015\Projects\CypressLang\CypressLang\Source\.\Grammar\CypressGrammar.g4:1:0: syntax error: 'ï' came as a complete surprise to me    
error(50): C:\Users\Ethan\Documents\Visual Studio 2015\Projects\CypressLang\CypressLang\Source\.\Grammar\CypressGrammar.g4:1:1: syntax error: '»' came as a complete surprise to me    
error(50): C:\Users\Ethan\Documents\Visual Studio 2015\Projects\CypressLang\CypressLang\Source\.\Grammar\CypressGrammar.g4:1:2: syntax error: '¿' came as a complete surprise to me  
error(50): C:\Users\Ethan\Documents\Visual Studio 2015\Projects\CypressLang\CypressLang\Source\.\Grammar\CypressGrammar.g4:1:3: syntax error: mismatched input 'grammar' expecting SEMI

What might be causing these errors, and how can I fix them? My best guess at the moment is that Visual Studio is inserting odd characters at the beginning of the file, and I can't remove them.

Ethan Bierlein

1 Answer

Today is not a good day.

Visual Studio decided to mess with me and changed the encoding of all my files to UTF-8. All I needed to do was go to File > Advanced Save Options and change the encoding to US-ASCII. This removed the odd characters inserted at the beginning of the file and solved most of my problems.
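If changing the encoding in the editor is not an option, the leading bytes can also be removed from outside Visual Studio. The following is only a rough sketch of that idea, assuming the odd characters are a UTF-8 byte order mark; the helper name `strip_utf8_bom` and the file `Demo.g4` are made up for illustration and are not part of my project.

```python
import codecs
import os
import tempfile


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file at `path`, if present.

    Returns True if a BOM was found and removed, False otherwise.
    (Hypothetical helper, not part of ANTLR or Visual Studio.)
    """
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    # Rewrite the file without the three BOM bytes.
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True


# Demo on a throwaway file rather than a real grammar.
demo_path = os.path.join(tempfile.mkdtemp(), "Demo.g4")
with open(demo_path, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"grammar Calculator;\n")

removed = strip_utf8_bom(demo_path)
with open(demo_path, "rb") as f:
    cleaned = f.read()
```

After this runs, `cleaned` starts directly with `grammar`, and a second call on the same file returns False because there is nothing left to strip. Unlike switching to US-ASCII, this keeps the rest of the file as UTF-8.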

Ethan Bierlein
  • These "odd" characters are the so-called BOM ([byte order mark](https://en.wikipedia.org/wiki/Byte_order_mark)). The file was probably stored as UTF-8, where the BOM consists of 3 bytes (the ones you got the errors for). You can store a Unicode file with or without a BOM, so you don't have to go back to ASCII encoding just to get rid of it. – Mike Lischke Oct 31 '15 at 09:24
  • @Ethan... Thank the Lord I found your answer. Saved me hours of digging – zAnthony Aug 09 '21 at 12:48
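
To connect the comment about the 3-byte BOM with the three characters in the error messages, here is a small illustration. It assumes the tool read the file one byte per character (Latin-1 is used here as a stand-in for whatever single-byte encoding was actually applied); the variable names are only for this example.

```python
import codecs

# The UTF-8 byte order mark is the three bytes EF BB BF.
bom = codecs.BOM_UTF8

# Read one byte per character (Latin-1, an assumption), the BOM
# becomes the three characters 'ï', '»' and '¿'.
as_latin1 = bom.decode("latin-1")

# Decoding with "utf-8-sig" instead consumes the BOM, so the text
# starts with the grammar keyword as intended.
text = (bom + b"grammar Calculator;").decode("utf-8-sig")
```

Here `as_latin1` holds exactly the three characters reported at columns 0 to 2 of the first line, and `text` begins with `grammar`.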