I'm working on a PEG parser (PEG.js). Among other structures, it needs to parse a tag directive. A tag can contain any character. If you want the tag to include a closing curly brace (}),
you can escape it with a backslash. A literal backslash must be escaped as well. I tried to implement this, inspired by the PEG grammar for JSON: https://github.com/pegjs/pegjs/blob/master/examples/json.pegjs
There are two problems:
- an escaped backslash results in two backslash characters in the parsed content instead of one. Example input:
{ some characters but escape with a \\ }
- the parser fails on an escaped curly brace (\}). Example input:
{ some characters but escape \} with a \\ }
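To make the intended result concrete, here is a minimal plain JavaScript sketch of the content I expect for these inputs. It is not part of the grammar: expectedTagContent is a hypothetical helper written only for illustration, and trimming the surrounding whitespace is my assumption about what the _ rules should achieve. It does not handle line breaks.

```javascript
// Hypothetical reference helper (not part of the grammar): reads one tag
// and returns the object I would expect the parser to produce.
function expectedTagContent(input) {
  if (input[0] !== "{") {
    throw new Error("tag must start with {");
  }
  let content = "";
  let i = 1;
  while (i < input.length) {
    const ch = input[i];
    if (ch === "\\") {
      // A backslash must be followed by another backslash or a closing brace.
      const next = input[i + 1];
      if (next !== "\\" && next !== "}") {
        throw new Error("invalid escape at position " + i);
      }
      content += next; // keep only the escaped character
      i += 2;
    } else if (ch === "}") {
      // An unescaped closing brace ends the tag.
      // Assumption: surrounding whitespace is not part of the content.
      return { type: "tag", content: content.trim() };
    } else {
      content += ch;
      i += 1;
    }
  }
  throw new Error("unterminated tag");
}

// String.raw keeps the backslashes exactly as typed in the example inputs.
console.log(expectedTagContent(String.raw`{ some characters but escape with a \\ }`).content);
console.log(expectedTagContent(String.raw`{ some characters but escape \} with a \\ }`).content);
```

For the first input I expect the content to end in a single backslash, and for the second I expect a literal } in the middle and a single backslash at the end.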
The relevant grammar is:
Tag
  = "{" _ tagContent:$(TagChar+) _ "}" {
      return { type: "tag", content: tagContent }
    }

TagChar
  = [^\}\r\n]
  / Escape
    sequence:(
        "\\" { return {type: "char", char: "\\"}; }
      / "}" { return {type: "char", char: "\x7d"}; }
    )
    { return sequence; }

_ "whitespace"
  = [ \t\n\r]*

Escape
  = "\\"
You can test the grammar and the example inputs in the online PEG.js sandbox: https://pegjs.org/online
I hope somebody has an idea how to resolve this.