In the standard parser/lexer model, the parser knows absolutely nothing about the input mechanism. It simply transforms a stream of tokens into a parse tree. "Files" and "interactive input" are not part of a parser's data model, and you'll find it much more convenient to maintain that separation.
A Bison parser can use YYABORT to clean up and terminate (by returning the error code 1). That's the same error return as is produced by a syntax error. It's important to use YYABORT in order to free used resources, particularly if the parser stack includes allocated objects. So, as you say, the question resolves to how the lexer communicates the desire to terminate.
Here, the lexer's options are limited. It can return a special-purpose token, not used in any parser rule, which will trigger a syntax error. Or it could just return 0, indicating that there is no more input, which might or might not trigger a syntax error. (Of those options, I'd go for returning 0, but there's not much difference.)
If the parser is doing anything more complicated than building up an AST (for example, if it actually attempts to produce some product, like executable code), then you will want to include a mechanism which suppresses further processing. That could be through a global (yuk!), or shared state communicated between the parser and the lexer using Bison's additional parameter declarations. The shared state could be as simple as a boolean flag, which might need to be checked:
- in yyerror, in order to suppress the syntax error;
- in any parser error action, which should YYABORT on premature end of input;
- in the parser's final reduction action (that is, the reduction to the start symbol), which should suppress further processing and probably call YYABORT;
- in whoever called the parser, in order to correctly interpret yyparse's error return.
So an easy solution would be to add a %param declaration in your Bison file for a bool* parameter, remembering to adjust the prototypes for yylex, yyerror, and other functions which need the extra parameter.
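As a rough illustration of the checks listed above, here is a minimal sketch of a Bison file, assuming Bison 3.0 or later (for %param and %empty). The flag name aborted, the toy grammar, and the NUMBER token are made up for the example; the lexer, which would set the flag before returning 0, is not shown.

```yacc
%code requires {
  #include <stdbool.h>
}

/* Extra argument handed to yyparse, yylex and yyerror (name is hypothetical). */
%param { bool *aborted }

%code {
  #include <stdio.h>
  int yylex(bool *aborted);
  void yyerror(bool *aborted, const char *msg);
}

%token NUMBER

%%

start : lines        { if (*aborted) YYABORT;  /* suppress further processing */ }
      ;

lines : %empty
      | lines line
      ;

line  : NUMBER '\n'
      | error '\n'   { if (*aborted) YYABORT; else yyerrok; }
      ;

%%

void yyerror(bool *aborted, const char *msg) {
  if (!*aborted)                 /* stay quiet if the user interrupted */
    fprintf(stderr, "%s\n", msg);
}

int main(void) {
  bool aborted = false;
  int status = yyparse(&aborted);
  if (aborted) {
    fputs("interrupted\n", stderr);
    return 2;
  }
  return status;                 /* 0 on success, 1 on a genuine syntax error */
}
```

Because YYABORT and a syntax error both make yyparse return 1, the caller in this sketch looks at the flag first and only then at the return value.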
How you actually detect the interrupt in your lexical scanner is a separate problem. Parsing a buffer's worth of input does not usually take a noticeable amount of time, so the easiest solution might be to let the interruption produce an EOF indication for the lexer, and then attempt to figure out whether the EOF was a real end of input or a user interrupt, either in your <<EOF>> action or in an implementation of yywrap.
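One possible way to get that EOF indication, assuming a POSIX system and an interrupt delivered as SIGINT, is sketched below. All of the names (interrupted, install_interrupt_handler, read_input) are hypothetical. The idea is that a custom YY_INPUT in the scanner would call something like read_input, so that an interrupt looks like end of input while the shared flag records the real reason.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Set asynchronously by the SIGINT handler. */
static volatile sig_atomic_t interrupted = 0;

static void on_sigint(int signo) {
    (void)signo;
    interrupted = 1;
}

/* Install the handler without SA_RESTART, so that a read() blocked in
 * the kernel fails with EINTR instead of being silently restarted. */
static int install_interrupt_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGINT, &sa, NULL);
}

/* Returns the number of bytes read, or 0 to signal "no more input".
 * On a user interrupt it returns 0 and sets *aborted, which lets the
 * <<EOF>> action, yywrap, or the parser tell the two cases apart. */
static size_t read_input(int fd, char *buf, size_t max, bool *aborted) {
    for (;;) {
        if (interrupted) {
            *aborted = true;
            return 0;
        }
        ssize_t n = read(fd, buf, max);
        if (n >= 0)
            return (size_t)n;
        if (errno != EINTR)
            return 0;            /* treat other read errors as end of input */
        /* EINTR: loop; the flag check decides whether it was SIGINT. */
    }
}
```

This sketch leaves a small window between the flag check and the call to read() in which a signal would go unnoticed until the next input arrives; closing that window (for example with pselect) is left out to keep the example short.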