Lexer for syntax highlighting of a language specified by a BNF grammar?

Question

Good day.

I want to implement syntax highlighting for a language using a lexer. The idea is simple: find a token and wrap it in the character sequences that select a particular color. The problem is that the language's tokens are described in BNF, whereas lex / flex expect tokens to be written as regular expressions.

The question is: how do I build a lexer for a BNF grammar?

That's a really broad question. Entire books have been written dealing with just that topic — UnholySheep, Feb 26 '18 at 18:41
I don't know about the specific tools you are asking about, but BNF describes context-free grammars, while regular expressions describe regular languages, which are a *subset* of the context-free languages, so regular expressions cannot cover everything BNF can express. — Eugene Sh., Feb 26 '18 at 18:43
It would help a lot if you showed the BNF for the lexical grammar. Or at least a representative sample. — rici, Feb 26 '18 at 19:05

score 1 · Accepted Answer · answered Feb 26 '18 at 19:18

BNF is a common notation for defining languages, but that does not mean you can simply feed it to a compiler generator. The first thing a compiler writer does is convert the grammar to a form better suited to the scanner and parser being used.

Just convert the BNF definitions of your lexical tokens to regular expressions; it is only a few hours' work. Alternatively, you can write a finite automaton for your BNF. I have written lexical scanners both ways, and I strongly suggest sticking with lex.
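To illustrate the conversion, here is a minimal sketch in Python rather than lex. The BNF rules in the comments, the keyword set, and the ANSI color choices are all hypothetical, since the question does not show its lexical grammar. The point is that a left-recursive token rule such as `<integer> ::= <digit> | <integer> <digit>` is just repetition, so it collapses to `[0-9]+`.

```python
import re

# Hypothetical lexical rules, hand-converted from BNF to regular expressions:
#   <integer>    ::= <digit> | <integer> <digit>           ->  [0-9]+
#   <identifier> ::= <letter> | <identifier> <letter>
#                  | <identifier> <digit>                  ->  [A-Za-z][A-Za-z0-9]*
# Order matters: keywords are listed first so they win over identifiers.
TOKEN_SPEC = [
    ("keyword", r"\b(?:if|then|else)\b"),  # assumed keyword set
    ("integer", r"[0-9]+"),
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"),
]

# ANSI escape codes, chosen arbitrarily for this sketch.
COLORS = {
    "keyword": "\033[35m",
    "integer": "\033[36m",
    "identifier": "\033[32m",
}
RESET = "\033[0m"

# One master pattern with a named group per token class.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def highlight(source):
    """Wrap each recognised token in its color code; copy other text unchanged."""
    def paint(match):
        # lastgroup is the name of the token class that matched.
        return COLORS[match.lastgroup] + match.group() + RESET
    return MASTER.sub(paint, source)


if __name__ == "__main__":
    print(highlight("if x1 then 42"))
```

The same regular expressions could be used as the patterns of lex / flex rules, with each action printing the color prefix, the matched text, and the reset sequence.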
