0

I am using Pest.rs for parsing. I need to parse identifiers but reject them if they happen to be a reserved keyword. For example, bat is a valid identifier name but this is not since that has a specific meaning. My simplified grammar is as below.

keyword = {"this" | "function"}
identifier = {ASCII+}
valid_identifier = { !keyword ~ identifier }

This works but it also rejects identifier names like thisBat. So basically it checks if that the prefix is not a keyword, but I want to check against the full identifier.

Evan Carroll
  • 78,363
  • 46
  • 261
  • 468
AppleGrew
  • 9,302
  • 24
  • 80
  • 124

2 Answers2

0

Figured out a hack to address this.

keyword = {"this" | "function"}
identifier = {ASCII+}
valid_identifier = @{ !keyword ~ identifier | keyword ~ identifier }

The new second rule in valid_identifier takes care of matching with the valid case which the first one rejects. Note I have made valid_identifier atomic so that whitespaces are not inserted and the parse output is not like this and Bat, but a single thisBat.

AppleGrew
  • 9,302
  • 24
  • 80
  • 124
0

Supposing that identifiers are composed of alphanumeric characters, another option is:

keyword = {"this" | "function"}
identifier = @{ !(keyword ~ !ASCII_ALPHANUMERIC) ~ ASCII_ALPHANUMERIC+ }

!(keyword ~ !ASCII_ALPHANUMERIC) rejects any identifier that starts with a keyword, as long as the character following the keyword can't be part of the identifier itself. This means that thisBat is an acceptable identifier, but this is not.