I'm parsing a language which has both < and <<. In my Alex definition I have rules along these lines:
tokens :-
"<" { token Lt }
"<<" { token (BinOp Shl) }
so whenever I encounter <<, it gets tokenized as a left shift and not as two less-thans. This is generally a good thing, since I throw out whitespace after tokenization and want to differentiate between 1 < < 2 and 1 << 2. However, there are other times when I wish << had been read as two <. For example, I have things like
<<A>::B>
which I want to be read as
< < A > :: B >
Obviously I can try to adjust my Happy parser rules to accommodate the extra cases, but that scales badly. In other, imperative parser generators, I might push back part of the token (something like push_back("<") when I encounter << but only need <).
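To make the push-back idea concrete, here is a minimal sketch in plain Haskell of what I mean. This is not Happy's or Alex's API: the Token type is my guess at a cut-down version of my real one, and expectLt is a made-up helper that works on a plain list of tokens.

```haskell
-- Hypothetical token types, mirroring the Alex rules above.
data BinOp = Shl deriving (Eq, Show)

data Token = Lt | Gt | BinOp BinOp | Ident String | ColonColon
  deriving (Eq, Show)

-- Consume a single < from the front of the stream.
-- If the stream starts with <<, take one < and "push back" the other
-- by leaving an Lt at the front of the remaining stream.
-- Returns Nothing if the stream starts with neither < nor <<.
expectLt :: [Token] -> Maybe [Token]
expectLt (Lt : rest)        = Just rest
expectLt (BinOp Shl : rest) = Just (Lt : rest)
expectLt _                  = Nothing

main :: IO ()
main = do
  -- <<A>::B> as the lexer tokenizes it today
  let toks = [BinOp Shl, Ident "A", Gt, ColonColon, Ident "B", Gt]
  print (expectLt toks)
```

What I don't see is how to get the equivalent of expectLt into a Happy-generated parser, where I don't control how tokens are pulled from the stream.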
Has anyone else had such a problem and, if so, how did you deal with it? Are there ways of "pushing back" tokens in Happy? Or should I instead keep a whitespace token around? I'm actually leaning towards the last alternative: although it would be a huge headache, it would let me deal with << by just making sure there is no whitespace between the two <.
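Roughly what I have in mind for the whitespace route, again as a plain-Haskell sketch and not real Alex or Happy code. The assumptions are mine: the lexer would emit only single < tokens, and instead of literal whitespace tokens each token carries its start and end offsets. Located and shiftAt are made-up names.

```haskell
-- Hypothetical token type; note there is no shift token at all.
data Token = Lt | Gt | Num Int
  deriving (Eq, Show)

-- A token with the input offsets where it starts and ends (end exclusive).
data Located = Located { tok :: Token, start :: Int, end :: Int }
  deriving (Eq, Show)

-- Two tokens are adjacent when no characters separate them.
adjacent :: Located -> Located -> Bool
adjacent a b = end a == start b

-- Try to read a left shift at the front of the stream: two < tokens
-- with nothing between them. On success, return the remaining tokens.
-- This would only be tried where the grammar expects a binary operator.
shiftAt :: [Located] -> Maybe [Located]
shiftAt (a : b : rest)
  | tok a == Lt, tok b == Lt, adjacent a b = Just rest
shiftAt _ = Nothing

main :: IO ()
main = do
  -- "<<2": the two < touch, so this reads as a shift
  print (shiftAt [Located Lt 0 1, Located Lt 1 2, Located (Num 2) 2 3])
  -- "< <2": a space separates them, so this is not a shift
  print (shiftAt [Located Lt 0 1, Located Lt 2 3, Located (Num 2) 3 4])
```

With this, <<A>::B> would just be a run of < tokens, and only the expression grammar would care whether two of them touch.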