Why is buffering used in lexical analysis? And what is the best value for EOF?
What have you discovered for yourself so far? – M.P. Korstanje Feb 11 '15 at 12:28
3 Answers
EOF is typically defined as (-1), a value outside the range of any valid character, so it can never be confused with real input.
In my time I have made quite a number of parsers using lex/yacc, flex/bison, and even a hand-written lexical analyzer with an LL(1) parser. 'Buffering' is rather vague and could mean multiple things (input characters or output tokens), but I imagine the question is about the lexical analyzer's input buffer, which lets it look ahead. When analyzing 'for (foo=0;foo<10;foo++)', the token for the keyword 'for' is produced only once the space following it is seen. The token for the first identifier 'foo' is produced once the lexer sees the character '='. It will want to pass the name of the identifier to the parser, and therefore needs a buffer so the word 'foo' is still somewhere in memory when the token is produced.
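As a rough sketch of that idea (hypothetical code, not taken from any of the parsers mentioned above): the lexer accumulates characters in a buffer and only emits the token, together with its text, when a delimiter ends the word.

```python
def classify(word):
    """Decide the token kind once the whole word is buffered."""
    if word in ("for", "while", "if"):
        return "KEYWORD"
    if word.isdigit():
        return "NUMBER"
    return "IDENT"

def tokenize(text):
    tokens = []
    buf = []  # buffer: holds the current word until a delimiter is seen
    for c in text:
        if c.isalnum():
            buf.append(c)  # keep the word in memory; no token yet
        else:
            if buf:
                # the delimiter tells us the word is complete
                word = "".join(buf)
                tokens.append((classify(word), word))
                buf = []
            if not c.isspace():
                tokens.append(("PUNCT", c))
    if buf:  # flush a trailing word at end of input
        word = "".join(buf)
        tokens.append((classify(word), word))
    return tokens
```

Running it on the example input, 'for' is emitted as a KEYWORD only when the space is reached, and 'foo' as an IDENT only when '=' is reached, exactly because the buffer kept the characters around.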

Speed of lexical analysis is a concern: reading the input one character at a time is slow. The lexer also needs to look several characters ahead of the current position in order to decide which token matches.

A lexical analyzer scans the input string character by character, from left to right, and those characters are read from hard disk or other secondary storage. Reading them one at a time can require a large number of system calls, depending on the size of the program, and can make the system slow. That's why we use the input buffering technique: an input buffer is a memory area that holds a block of incoming characters before they are passed on for processing. You can find more information here: https://www.geeksforgeeks.org/input-buffering-in-compiler-design/
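To make the system-call argument concrete, here is a minimal sketch (hypothetical class and names, not from the linked article) of a scanner that refills its buffer in large chunks, so one read from the underlying stream serves thousands of character requests. Returning `None` at end of input plays the role of the EOF sentinel.

```python
import io

BUF_SIZE = 4096  # characters fetched per read of the underlying stream

class BufferedScanner:
    """Serves characters one at a time, but reads the source in
    BUF_SIZE-sized chunks so the lexer does one read() per chunk
    instead of one per character."""

    def __init__(self, stream, buf_size=BUF_SIZE):
        self.stream = stream
        self.buf_size = buf_size
        self.buf = ""
        self.pos = 0
        self.reads = 0  # counts how many times we hit the stream

    def next_char(self):
        if self.pos >= len(self.buf):
            # buffer exhausted: refill with one bulk read
            self.buf = self.stream.read(self.buf_size)
            self.reads += 1
            self.pos = 0
            if not self.buf:
                return None  # end of input (stand-in for the EOF sentinel)
        c = self.buf[self.pos]
        self.pos += 1
        return c
```

Consuming a 10,000-character source with a 4,096-character buffer touches the stream only four times (three data reads plus one empty read that signals EOF), instead of 10,000 times with unbuffered per-character reads.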