I have this basic JFlex lexer:

import java.util.*;
%%

%public
%class TuringLexer
%type Void

%init{
yybegin(YYINITIAL);
%init}

%state COMM, GETALPH, MT, PARSELOOP, PARSELEMS, PARSESYMB, PARSEMT
%{
  ArrayList<Character> alf = new ArrayList<Character>();   
  String crtMach;
  String crtLoop;
  String crtLoopContent;
  String crtLoopContentParam;
  String crtContent;
  String crtSymb;
%}

//Input = [^\r\n]
SEP = [:space:]*
//COMM =[;.*$] 
name = [A-Za-z_]*
tok=[A-Za-z0-9#$@\*]
AL = "alphabet :: "
cont = [^]]*
param =[^)]*
letter = [A-Za-z]
opn = [\[?]
symb = [^\}]+
%%
 <COMM> {
  "."  { /* ignore */  System.out.println("Got into comm state ");}
  "\n" {System.out.println("Got out of comm state ");yybegin(YYINITIAL);}
}
 <GETALPH> {
 {SEP} { /* ignore */ }
 {tok} { String str = yytext();
     System.out.println("Alphabet -- " + str);
     Character c = str.charAt(0);
     alf.add(c); }
 ";"  {yybegin(YYINITIAL);}

}
 <YYINITIAL> {
 "\n"   { /* ignore */ System.out.println("Got into YYINITIAL"); }
 ";"  { yybegin(COMM); }

[^]                    { throw new Error("Illegal character <"+yytext()+">"); }
}

Code has been removed for clarity; the issue still persists in this reduced version, so it should be easier to identify here.

The input file is called simple.mt.

And this is the main class:

import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.Reader;
import java.io.BufferedReader;
import java.io.FileReader;
public class MainClass  {
public static void main(String args[]) throws IOException {
    Reader reader = new BufferedReader(new FileReader ("simple.mt"));
    reader.read();
    TuringLexer tl = new TuringLexer(reader);
    tl.yylex();
}
}

When I run the project in Eclipse (or from a terminal, for that matter), I get:

Exception in thread "main" java.lang.Error: Illegal character <l>
    at TuringLexer.yylex(TuringLexer.java:576)
    at MainClass.main(MainClass.java:11)

I have no idea what the error means or how to debug it. What remains of the JFlex file is a small sample, so the error shouldn't be that hard to figure out.

user207421
pAndrei
  • You are throwing that exception from your own code and you don't know what it means? – user207421 Jan 11 '13 at 23:07
  • If I don't throw it I get another type of error. I don't have the respective code at hand at the moment, but if I remember correctly I was getting a "Cannot match input" error instead. – pAndrei Jan 11 '13 at 23:09
  • You seem to have combined lexing with some of your parsing logic. Lexers need to be relatively simple, only identifying tokens, and leaving the heavy lifting to the parser. What you have looks way too complex. – Jim Garrison Jan 12 '13 at 07:09

2 Answers

So you have a character appearing in your input that you don't know how to handle.

All lex files should have a final . rule that either prints an 'illegal character' error message (not a thrown exception), or else just returns yytext[0] to the parser for the parser to deal with.

The latter strategy also saves you from having to write a rule for each special character, for example =, + and so on: the parser should just use them as '=', '+', etc. Then (a) any illegal character just becomes a syntax error, but more importantly (b) the parser gets to use its error recovery, rather than just throwing the token away.
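
In JFlex terms, the first alternative could look roughly like the fragment below. This is only a sketch, assuming the rule is placed last in the rules section of the spec from the question and that printing to System.err is acceptable there. It uses [^] rather than . because in JFlex . does not match line terminators, and it has no state prefix on the assumption that all states are declared with %state (inclusive), as they are in the question.

```
/* Fallback rule: keep it last so that every more specific rule wins a tie. */
[^]   { System.err.println("Illegal character <" + yytext() + ">"); }
```

The second alternative, handing the character to a parser, would need the lexer to return a value, so the %type Void declaration from the question would have to change first.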

user207421
  • Could you please elaborate a bit? Are you suggesting that I replace the rule that causes the syntax error with one that returns yytext[0] to the parser? Thank you! – pAndrei Jan 12 '13 at 05:08
  • @pAndrei I suggested that you replace it with one of the alternatives I mentioned. – user207421 Jan 12 '13 at 06:09

You either do not show all the grammar or the grammar is incomplete.

Exception in thread "main" java.lang.Error: Illegal character <l>

This message tells you that you don't handle loop keywords.
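
To see why a single letter triggers the error, here is a hand-written Java stand-in for the three YYINITIAL rules shown in the question. It is not JFlex output, and the class name, method name and sample input are hypothetical; the real simple.mt is not shown, so "loop" is only an assumed example.

```java
public class InitialStateSketch {

    /**
     * Mimics the YYINITIAL rules from the question and returns the first
     * character that only the catch-all [^] rule would match, or null if
     * none is found before a ';' or the end of the input.
     */
    static Character firstIllegal(String input) {
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\n') {
                continue;        // "\n" rule: ignored
            }
            if (c == ';') {
                return null;     // ";" rule: yybegin(COMM), so the sketch stops here
            }
            return c;            // anything else reaches [^] and raises the Error
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical input, not the actual contents of simple.mt.
        System.out.println(firstIllegal("\nloop"));
    }
}
```

Only a newline and a semicolon have rules of their own in YYINITIAL, so any other character, whether it starts a keyword or not, ends up in the rule that throws.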

tcb
  • Not really. It means that there is no rule that matches an incoming 'l'. You might be able to infer what you said from the complete grammar and the input file, but it's not what this message alone means. – user207421 Jan 13 '13 at 04:59
  • I accepted EJP's answer because, while it was not the exact solution (code wise), I got the idea to add the rule : . { } at the bottom of my file. This solved my error :) – pAndrei Jan 13 '13 at 14:22
  • Oh, I am sorry for that, I have accepted it now. I apologize again for this issue, thank you for helping me. Thanks as well for pointing it out! – pAndrei Jan 16 '13 at 16:50