Recognizing and ignoring comments is best done in the lexical scanner, using a rule like:
REM([[:blank:]].*)?$ ;
That's not the same as "Where randominstruction has some random instructions in C", which is a bit misleading. The comment rule ignores the rest of the line, regardless of whether it consists of valid language tokens or not.
The complexity of the regular expression above is due to the need for REM to not be part of a longer word. Insisting that it is followed by whitespace might be too strict (is REM(....) a valid comment?), so it is possible that a better one would be:
REM([^[:alnum:]_].*)?$ ;
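To see how the two patterns differ, here is a rough sketch that approximates them with Python's re module. It is only an illustration, not a lex scanner: Python has no POSIX bracket classes, so [[:blank:]] and [[:alnum:]] are spelled out as ASCII sets, and the names STRICT, RELAXED and is_comment are made up for this example.

```python
import re

# Assumption: ASCII-only stand-ins for the POSIX classes used in the lex rules.
#   [[:blank:]]   -> [ \t]
#   [^[:alnum:]_] -> [^A-Za-z0-9_]
STRICT = re.compile(r"REM([ \t].*)?$")            # REM must be followed by a blank
RELAXED = re.compile(r"REM([^A-Za-z0-9_].*)?$")   # REM must not continue a word


def is_comment(pattern, text):
    """Return True if `text` (one line, no newline) is entirely a REM comment."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for line in ["REM hello", "REM", "REM(note)", "REMARK", "REM_x"]:
        print(repr(line), is_comment(STRICT, line), is_comment(RELAXED, line))
```

Both versions accept a bare REM and REM followed by a blank, and both reject REMARK. Only the relaxed one accepts REM(note).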
The intent of ?$ is to also accept REM if it is the only thing on the line. It may well have been clearer to use two patterns:
REM[^[:alnum:]_].* ;
REM$ ;
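As a sanity check that the two-pattern form accepts the same lines as the single pattern, the following sketch compares them on a few inputs. Again this uses Python's re module as a stand-in for lex, with [^[:alnum:]_] written as the ASCII set [^A-Za-z0-9_]; the names COMBINED, WITH_TEXT, BARE and the helper functions are hypothetical.

```python
import re

# Assumption: ASCII-only translation of [^[:alnum:]_].
COMBINED = re.compile(r"REM([^A-Za-z0-9_].*)?$")   # single-pattern form
WITH_TEXT = re.compile(r"REM[^A-Za-z0-9_].*")      # REM followed by comment text
BARE = re.compile(r"REM$")                         # REM alone on the line


def combined_matches(line):
    """True if the single combined pattern consumes the whole line."""
    return COMBINED.fullmatch(line) is not None


def split_matches(line):
    """True if either of the two separate patterns consumes the whole line."""
    return (WITH_TEXT.fullmatch(line) is not None
            or BARE.fullmatch(line) is not None)


if __name__ == "__main__":
    samples = ["REM", "REM x = 1", "REM(note)", "REMARK", "REM_x", "REM2"]
    for line in samples:
        print(repr(line), combined_matches(line), split_matches(line))
```

On these samples the two forms agree, which is what one would expect since the optional group is either present (first pattern) or absent (second pattern).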