
For example, I return a yy::parser::symbol_type from a flex rule like this:

[a-zA-Z][a-zA-Z0-9_]*  return yy::parser::make_ID(yytext);

Here ID is a token I defined in Bison, which generates the yy::parser::token structure for it.

Now I want to unit-test just flex's token.l. When I call the yy::parser::symbol_type yylex() function, I can't find any API in the Bison C++ variant manual for getting a yy::parser::token from a yy::parser::symbol_type.

By the way, the Bison manual recommends returning yy::parser::symbol_type from flex rules via the yy::parser::make_XXX APIs.

Or is there no such API? Do I need to use the symbol_type.kind() API to get something like yy::parser::symbol_type_kind::S_T_ID?

linrongbin

1 Answer


There is no object of type yy::parser::token. That struct exists only to hold the enum token_kind_type enumeration (the enumeration type itself, not a data member of that type). See the Bison manual's outline of the C++ API.

I don't really understand the motivation for this, but I suppose it comes from a desire to support ancient versions of the C++ standard which didn't have enum class.
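To make that idiom concrete, here is a minimal sketch of a struct used purely as a scope for an enumeration, next to the C++11 enum class that would do the same job. This is not Bison's generated code: the enumerator names and the numeric values 258 and 259 are made up for illustration, as is the helper as_int.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the generated yy::parser::token (NOT Bison output).
// The struct has no data members; it only scopes the enumerators, so they are
// spelled token::ID and token::NUMBER.
struct token {
    enum token_kind_type { ID = 258, NUMBER = 259 };  // values are illustrative
};

// Since the struct is only a namespace-like wrapper, it is an empty class.
static_assert(std::is_empty<token>::value, "token only scopes an enumeration");

// The C++11 way to get the same scoping without a wrapper struct.
enum class token_kind : int { ID = 258, NUMBER = 259 };

// Hypothetical helper: an enumerator of the wrapped enum as a plain int.
inline int as_int(token::token_kind_type k) { return static_cast<int>(k); }
```

With these definitions, token::ID names an enumerator, while an expression such as `token t;` would only create an empty object that carries no token at all.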

In any case, I am almost certain that the value you want is indeed the return value of symbol_type.kind().
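As a rough illustration of what a lexer unit test built on kind() might look like, the sketch below uses a hand-written mock in place of the generated parser class. Everything in namespace mock is an assumption made so the example stands alone: the enumerator S_ID and its value, the plain std::string member used for the semantic value (the real generated class stores a variant), and fake_yylex, which stands in for the flex-generated yylex(). Only the idea of comparing the result of kind() comes from the discussion above.

```cpp
#include <cassert>
#include <cctype>
#include <string>

namespace mock {  // hypothetical stand-in for the generated yy::parser

struct symbol_kind {
    enum symbol_kind_type { S_YYEOF = 0, S_ID = 3 };  // values are illustrative
};

struct symbol_type {
    symbol_kind::symbol_kind_type kind_;
    std::string value;  // semantic value; a variant in real generated code

    // What the parser cares about: which kind of symbol this is.
    symbol_kind::symbol_kind_type kind() const { return kind_; }
};

// Mirrors the shape of a make_ID factory: kind is fixed, value comes from text.
inline symbol_type make_ID(const std::string& text) {
    return symbol_type{symbol_kind::S_ID, text};
}

// Toy replacement for yylex(): an input starting with a letter is an ID,
// anything else (including empty input) is reported as end of file.
inline symbol_type fake_yylex(const std::string& input) {
    if (!input.empty() && std::isalpha(static_cast<unsigned char>(input[0]))) {
        return make_ID(input);
    }
    return symbol_type{symbol_kind::S_YYEOF, std::string()};
}

}  // namespace mock
```

A test would then check the kind and, separately, the semantic value, for example `assert(mock::fake_yylex("foo").kind() == mock::symbol_kind::S_ID);`. Against a real scanner the same two checks would presumably use the generated names instead of the mock ones.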

rici
  • This answer is 100% based on the bison manual. I don't use the C++ API, and haven't yet installed the most recent version of bison in order to try it. So if it doesn't work for you, please let me know (and explain how it fails to meet your needs). – rici Aug 27 '20 at 14:55
  • I kind of understand the design of the bison symbol_type API now. I could use `symbol_type.value.as<int>()` to get the integer value (which is actually the `yy::parser::token::token_kind_type`), or get the string via `symbol_type.value.as<std::string>()` – linrongbin Aug 28 '20 at 00:42
  • @linrongbin: I don't think so. `symbol_type.value` is the semantic value passed to the parser, not the token type. Some token kinds may have `int` values and some may have `std::string` values, but no token kind should have both, and those are not the code and name of the token kind itself. – rici Aug 28 '20 at 00:49
  • @linrongbin: For example, if the token were created with `yy::parser::make_ID (yytext, loc);` (from the example in the manual), the `token_kind_type` is `ID` (or perhaps `S_T_ID`, if you're using token prefixes); the value is copied from `yytext`, which is a `char*` so it may well be a `std::string` if that's the semantic type of `ID`. – rici Aug 28 '20 at 00:52
  • Yes, that's really strange, for now I just store `yy::parser::token::token_kind_type` as an integer into `symbol_type.value` as `int`. – linrongbin Aug 28 '20 at 01:04
  • @linrongbin: perhaps it's my focus on the C API (aka "split symbols" in the C++ API) but I don't find it odd at all. The token kind is what the parser actually uses. The token value is some auxiliary information which is only of semantic interest. It might be used by attribute rules (aka semantic actions) but does not influence the syntactic analysis. I.e., an identifier is just an identifier to the parser (that is, an `ID`) and you can construct the entire parse tree just knowing that. But when it comes to the semantics of the program, you have to know *which* identifier each `ID` refers to. – rici Aug 28 '20 at 01:19
  • I suppose you are trying to extract the token kind of a token in a semantic action. Bison doesn't consider that requirement, because it's really not useful outside debugging. (And anyway, it's completely clear in each semantic action: you just have to look at the right-hand side. But that's not much use for debugging functions. Try to use bison's built-in debugging trace rather than writing your own.) – rici Aug 28 '20 at 01:25
  • I simply want to store an integer in the abstract syntax tree to distinguish tokens or operators like `+` `-` `*` `/`. – linrongbin Aug 28 '20 at 01:39
  • @linrongbin: Well, each semantic action which creates an AST node using one of those operators is distinct. I guess it's irritating not being able to use exactly the same text in each of those actions, but you only have to do it once :( – rici Aug 28 '20 at 02:37