I have defined a hash table keyword_table to store all the keywords of my language. Here is part of the code:
(* parser.mly *)
%token CALL CASE CLOSE CONST
...
reserved_identifier:
| CALL { "Call" }
| CASE { "Case" }
| CLOSE { "Close" }
| CONST { "Const" }
...
(* lexer.mll *)
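{
  (* Build a table from (keyword, token) pairs, keyed by the lowercased keyword. *)
  let hash_table list =
    let tbl = Hashtbl.create (List.length list) in
    List.iter (fun (s, t) -> Hashtbl.add tbl (String.lowercase_ascii s) t) list;
    tbl
  let keyword_table = hash_table [
    "Call", CALL; "Case", CASE; "Close", CLOSE; "Const", CONST;
    ... ]
}
rule token = parse
  | lex_identifier as li
    { (* Keywords are matched case-insensitively; anything else is an identifier. *)
      try Hashtbl.find keyword_table (String.lowercase_ascii li)
      with Not_found -> IDENTIFIER li }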
As there are a lot of keywords, I would really like to avoid repeating code as much as possible.
In parser.mly, it seems that %token CALL CASE ... cannot be simplified, because each token must be declared explicitly. However, for the reserved_identifier part, is it possible to call a function that returns a string from a token, instead of hard-coding each string?
That probably suggests that a hash table is not suitable for this purpose. Which data structure is the best choice for lookups in both directions (assuming that the keys on each side are unique)? In the end, I would like find_0 table "Call" to return the token CALL (used in lexer.mll) and find_1 table CALL to return "Call" (used in parser.mly).
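To make the desired interface concrete, here is a minimal sketch of what I have in mind, built from two hash tables (one per direction). The token type below is only a stand-in for the one generated from parser.mly, and the names two_way_table, by_name, by_token and two_way_of_list are hypothetical; find_0 and find_1 are the two lookups described above. I am not claiming this is the best structure, only illustrating the behaviour I am after.

```ocaml
(* Stand-in for the token type generated from parser.mly (assumption). *)
type token = CALL | CASE | CLOSE | CONST | IDENTIFIER of string

(* Two hash tables, one per lookup direction. *)
type 'tok two_way_table = {
  by_name : (string, 'tok) Hashtbl.t;   (* lowercased keyword -> token *)
  by_token : ('tok, string) Hashtbl.t;  (* token -> canonical spelling *)
}

(* Build both directions from a single list of (keyword, token) pairs,
   so each keyword is written down only once. *)
let two_way_of_list pairs =
  let n = List.length pairs in
  let t = { by_name = Hashtbl.create n; by_token = Hashtbl.create n } in
  List.iter
    (fun (s, tok) ->
      Hashtbl.replace t.by_name (String.lowercase_ascii s) tok;
      Hashtbl.replace t.by_token tok s)
    pairs;
  t

(* String -> token, case-insensitive; raises Not_found for non-keywords. *)
let find_0 table s = Hashtbl.find table.by_name (String.lowercase_ascii s)

(* Token -> string; raises Not_found for tokens that are not keywords. *)
let find_1 table tok = Hashtbl.find table.by_token tok

let keyword_table = two_way_of_list [
  "Call", CALL; "Case", CASE; "Close", CLOSE; "Const", CONST;
]
```

With such a table, the lexer action would call find_0 keyword_table li, and the reserved_identifier actions would call find_1 keyword_table on the matching token instead of spelling out each string.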
Also, if this table can be defined, where should I put it so that parser.mly can use it?