
I am building a high-level language, and I cannot understand why I get this error when trying to generate a lexer for it.

This is my code:

start : codigo; 

codigo : funcion* home;

funcion : DEF ID PL (ID (',' ID)*)* PR LLL expresion LLR;

home : HOME PL PR LLL expresion+ graph LLR;

@expresionuno:
    variable
    | numero 
    | expuno 
    | expdos
    | funcioncall
    | parexpdos     ;

@expresiondos:
    condicional 
    | parexp    ;

@expresion:
    expresionuno
    | expresiondos  ;

expresionlogica:
    expresionuno (EQ | NE | LT | LE | GT | GE) expresionuno 
    | TRUE 
    | FALSE     ;

condicional:
    IF PL expresionlogica PR LLL expresion LLR (ELSE LLL expresion LLR)?;

variable:
    ID;

numero: 
    '(\+|-)?[0-9]+(\.[0-9]+)?';

parexp:
    PL expresion PR;

parexpdos:
    (ADD | SUB)? PL expresionuno PR;

expuno:
    expresionuno (ADD | SUB) expresionuno;

expdos:
    expresionuno (MUL | DIV | MOD | POW) expresionuno;

funcioncall:
    ID PL (expresion (',' expresion)*)* PR;

graph : ID PL funcioncall numero numero PR;

/**
 * Lexer rules
 *
 * Here we define the tokens identified by the lexer.
 */

// COMMENTS
ABRIR_COMENTARIO : '/\*';
CERRAR_COMENTARIO : '\*/';
COMENTARIO : ABRIR_COMENTARIO '.*?' CERRAR_COMENTARIO (%ignore);

// ARITHMETIC SYMBOLS
ADD     : '\+';
SUB     : '-';
MUL     : '\*';
DIV     : '/';
MOD     : '%';
POW     : '^';

// LOGICAL SYMBOLS
EQ      : '==';
NE      : '!=';
LT      : '<';
LE      : '<=';
GT      : '>';
GE      : '>=';

// BRACKET TYPES
PL      : '\(';
PR      : '\)';
LLL     : '{';
LLR     : '}';

// ID NAMES
ID: '[a-z]+'
    (%unless
        DEF: 'func';
        HOME    : 'home';
        IF      : 'if';
        ELSE    : 'else';
        TRUE    : 'true';
        FALSE   : 'false';
    );

WS : '[ \t\r\n]+' (%ignore);

And this is the output:

ERROR: Regular expression for rule 't_POW' matches empty string
Traceback (most recent call last):
  File ".\CompilatorMain.py", line 19, in <module>
    arbol = Grammar(gramatica, auto_filter_tokens=False).parse(inputCode)
  File "C:\ProgramData\Anaconda3\lib\site-packages\plyplus\plyplus.py", line 560, in __init__
    self._grammar = self._create_grammar(grammar, source, tab_filename, options)
  File "C:\ProgramData\Anaconda3\lib\site-packages\plyplus\plyplus.py", line 569, in _create_grammar
    return _Grammar(grammar_tree, source, tab_filename, options)
  File "C:\ProgramData\Anaconda3\lib\site-packages\plyplus\plyplus.py", line 674, in __init__
    self.engine.build_lexer()
  File "C:\ProgramData\Anaconda3\lib\site-packages\plyplus\engine_ply.py", line 94, in build_lexer
    self.lexer = lex.lex(module=self.callback, reflags=re.UNICODE)
  File "C:\ProgramData\Anaconda3\lib\site-packages\ply\lex.py", line 909, in lex
    raise SyntaxError("Can't build lexer")
SyntaxError: Can't build lexer
  • Perhaps `POW: '\^'`? I think you were just matching start of line. – tdelaney May 31 '20 at 22:25
  • The error says that the regular expression for POW matches the empty string (which would mean that all programs are an infinite number of POW operations), and indeed your POW regular expression matches the empty string. It's not a stretch to think that changing that regex to match the hat character would fix the problem. I'm not sure why one would post here and then just ignore comments. – tdelaney May 31 '20 at 22:48
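
The point made in the comments can be checked with Python's standard `re` module alone, without plyplus. This is a minimal sketch: the helper name `matches_empty` is made up for illustration, and it assumes the lexer rejects a token pattern whenever that pattern can match the empty string, which is what the error message suggests.

```python
import re

# Unescaped, '^' is an anchor: it matches the empty position at the
# start of the input and consumes no characters.
unescaped = re.compile('^')

# Escaped, '\^' matches one literal caret character.
escaped = re.compile(r'\^')


def matches_empty(pattern):
    """Return True if the compiled pattern can match the empty string."""
    return pattern.match('') is not None


print(matches_empty(unescaped))  # True
print(matches_empty(escaped))    # False

# The escaped pattern consumes the caret in an input such as '^2'.
caret = escaped.match('^2').group(0)
print(caret)
```

If that assumption holds, writing the rule as `POW : '\^';` in the grammar should let the lexer build, since the escaped pattern always consumes exactly one character.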
