
I am trying to write a lexical analyzer for the C# language, but I can't figure out how to tell a plus sign from the plus operator without looking at the context. I need the next token from the source file. So when I encounter a +, how do I know whether it is the sign of an integer or real literal, or the + operator? How can my scanning function distinguish these two situations? The case is similar to <, <= and <<, but in my situation the next character doesn't always help.

int a = +1;
a=2 + 3;
Radu Mardari

1 Answer


I am trying to write a lexical analyzer for C# language

OK, but you've put the lexer/parser separation line in the wrong place here.

The lexer's job is to "cut" the input string into tokens. The parser's job is to interpret these. Your lexer should just detect the + operator, emit the corresponding token, and that's it.

Then your parser, which has context knowledge (i.e. it knows which part of an expression it is trying to parse at a given moment), is in a much better position to tell a unary operator from a binary one. The lexer simply lacks the necessary information.
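To illustrate how the parser can use its position in the grammar, here is a minimal sketch in Python (the question doesn't say which language the lexer is written in, so that choice is an assumption). The function name `parse_expression` and the tuple-based tree are hypothetical, and the grammar only covers numbers with + and -, nothing close to full C#.

```python
def parse_expression(tokens):
    """Parse a flat token list such as ['2', '+', '+', '1'] into nested tuples."""
    pos = 0

    def parse_operand():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok in ("+", "-"):
            # An operand is expected here, so this sign can only be unary.
            return ("unary", tok, parse_operand())
        return ("num", tok)

    def parse_sum():
        nonlocal pos
        left = parse_operand()
        # A sign that follows a complete operand can only be binary.
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            left = ("binary", op, left, parse_operand())
        return left

    return parse_sum()
```

The same + token ends up as unary or binary purely because of where the parser is when it sees it: `parse_expression(['+', '1'])` yields a unary node, while `parse_expression(['2', '+', '3'])` yields a binary one.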

Likewise, you shouldn't include the - sign in number tokens.

Here are some lexing examples:

int a=+1; --> int a = + 1 ;

a=2+3; --> a = 2 + 3 ;

Note the + 1 in the first case. Your lexer shouldn't emit +1.
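A lexer that behaves this way could be sketched roughly as follows. This is Python, chosen only for brevity; `tokenize` is a hypothetical name, and the sketch assumes integer literals, identifiers and single-character punctuation only (no real literals, strings, comments or multi-character operators).

```python
def tokenize(source):
    """Split source text into tokens; a sign is never merged into a number."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            # Number token: digits only, no leading + or -.
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1
            tokens.append(source[start:i])
        elif ch.isalpha() or ch == "_":
            # Identifier or keyword.
            start = i
            while i < len(source) and (source[i].isalnum() or source[i] == "_"):
                i += 1
            tokens.append(source[start:i])
        else:
            # Anything else is emitted as a one-character token, + included.
            tokens.append(ch)
            i += 1
    return tokens
```

With this sketch, `tokenize("int a=+1;")` gives `['int', 'a', '=', '+', '1', ';']`, matching the first example above.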

Lucas Trzesniewski
  • Thank you! Now I figure out how to solve this. Have a nice day. – Radu Mardari Nov 14 '14 at 18:06
  • But now I face a new problem. This situation is the same as this; – Radu Mardari Nov 15 '14 at 07:06
  • But I face a new problem. How do I deal with this situation: < and <=? I read that the lexical analyzer is supposed to recognize them, not the parser. Is it fair to treat plus specially? – Radu Mardari Nov 15 '14 at 07:14
  • @user2991856 The lexer should emit `<=` as a single token in this case. If you encounter `<` immediately followed by `=` then emit `<=`, if it's followed by something else, then emit a `<`. Templates will be the tough part for the `>` though (recognize the `>>` operator vs `>` `>` in `Func>` for instance) – Lucas Trzesniewski Nov 15 '14 at 10:53
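
The "followed by `=`" rule from the last comment is usually written as a longest-match scan. Below is a minimal Python sketch of it; `OPERATORS` and `scan_operator` are hypothetical names, and the operator table is a small illustrative subset rather than the C# operator set. `>>` is left out on purpose, since the comment flags it as the hard case.

```python
# Illustrative subset of operators (an assumption, not the full C# list).
OPERATORS = ["<<=", "<<", "<=", "<", ">=", ">", "+", "="]


def scan_operator(source, i):
    """Return (operator, next_index) for the longest operator starting at i."""
    # Try longer spellings first so that "<=" wins over "<".
    for op in sorted(OPERATORS, key=len, reverse=True):
        if source.startswith(op, i):
            return op, i + len(op)
    # No operator starts at this position.
    return None, i
```

For example, `scan_operator("a<=b", 1)` returns `("<=", 3)`, while `scan_operator("a<b", 1)` returns `("<", 2)`. Note that + needs no such lookahead to decide between unary and binary: it is one token either way, and the parser sorts out its role.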