I am just getting into writing a DSL and would like to use JISON (http://zaach.github.io/jison). I am trying to learn the grammar syntax and am running into a problem with specifying a string of characters in double quotes.

What I would think would work is:

%lex
%%

[\n\s]+                 /* skip whitespace */
"true"|"false"          return 'BOOL'
"IF"                    return 'START'
"AND"|"OR"              return 'LOGIC'
<<EOF>>                 return 'EOF'
.                       return 'INVALID'

/lex

%start string
%%

string
    : '"' [^"]+ '"'
        {$$ = $2;}
    ;

... or perhaps:

%lex
%%

[\n\s]+                 /* skip whitespace */
"true"|"false"          return 'BOOL'
"IF"                    return 'START'
"AND"|"OR"              return 'LOGIC'
\"[^"]+\"               return 'STRING'
<<EOF>>                 return 'EOF'
.                       return 'INVALID'

/lex

%start string
%%

string
    : STRING
        {$$ = $1;}
    ;

The first (basically) doesn't work at all, while the second one kind of works: when it finds a string, the value coming out still includes the escaped double quotes.

Is there a good resource that helps with learning JISON/BISON/BNF grammar definitions? I have been looking around but haven't been able to find anything that helps me; I am not a computer science major. Am I just missing something simple, or something more substantial?

For some context:

I am trying to define a simple DSL for parsing simple conditions:

IF Something > 100
AND Another == true
    doAction 2.51
kalisjoshua

1 Answer

You probably just need to trim the quotes:

\"[^"]+\"         yytext = yytext.slice(1,-1); return 'STRING'
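The effect of that action can be checked outside the lexer in plain JavaScript. This is only a sketch: `stripQuotes` is a hypothetical helper name, and `matched` stands in for what `yytext` would hold when the STRING rule fires.

```javascript
// Hypothetical helper mirroring the lexer action above.
function stripQuotes(lexeme) {
  // slice(1, -1) drops the first and last characters,
  // which for a STRING match are the two delimiting quotes.
  return lexeme.slice(1, -1);
}

// Stand-in for yytext at the moment the STRING rule matches.
const matched = '"hello world"';
const value = stripQuotes(matched);
console.log(value);
```

Note that the matched text contains plain `"` characters, not backslash-quote pairs, so a pattern that looks for a literal backslash before the quote would find nothing to replace.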

Aside from toy languages, strings are usually a lot more complicated than just a sequence of characters surrounded by quotes. You normally at least have to deal with some form of escaping special characters:

"A \t tab and a newline \n embedded in a \"string\"."

Or SQL/CSV-style quote escaping:

"Embedded ""quoted string"" in a quoted string."

And you might even want to do Perl/Bash-style variable substitution:

"This gets really complicated: $ButSomePeopleLikeIt"

So reprocessing the string is quite common, and not just to remove the delimiters. This can be done one character (sequence) at a time with start conditions, or in a separate post-processing operation.
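As a rough illustration of the separate post-processing route, the sketch below handles the two escaping styles shown earlier. The names `unescapeString`, `undoubleQuotes`, and `ESCAPES` are hypothetical, the set of supported escapes is an assumption, and it presumes the lexer rule has been extended to accept such strings in the first place.

```javascript
// Assumed escape table; a real language would define its own set.
const ESCAPES = { n: '\n', t: '\t', '"': '"', '\\': '\\' };

// Backslash style: strip the delimiters, then translate each
// backslash + character pair. Unknown escapes keep the bare character.
function unescapeString(lexeme) {
  const body = lexeme.slice(1, -1);
  return body.replace(/\\(.)/g, (match, ch) =>
    Object.prototype.hasOwnProperty.call(ESCAPES, ch) ? ESCAPES[ch] : ch);
}

// SQL/CSV style: strip the delimiters, then collapse each doubled
// quote into a single one.
function undoubleQuotes(lexeme) {
  return lexeme.slice(1, -1).replace(/""/g, '"');
}

const cooked = unescapeString('"A \\t tab and a \\"quote\\""');
const sqlStyle = undoubleQuotes('"Embedded ""quoted string"" in a quoted string."');
console.log(cooked);
console.log(sqlStyle);
```

In a lexer action, a call such as `yytext = unescapeString(yytext)` would take the place of the plain `slice(1,-1)`.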

rici
  • Thank you. I was attempting something of the sort with `yytext.replace(/\\"/g, '')` and that didn't initially work, but is this, or what you suggested, a "proper" way of doing this in bison/jison? I felt a little "hacky" doing this, but if it is just how it is done then I am fine. – kalisjoshua Sep 19 '14 at 11:12
  • @kalisjoshua I don't know of any other way of doing it; that's certainly how I would do it with `flex` (although in `flex` you have to copy `yytext` anyway, so you might feel that it is less "hacky" :) ). – rici Sep 19 '14 at 16:34