I am getting an exception when attempting to parse the # character using TatSu:

    import tatsu

    grammar = r'''
    @@comments :: //
    @@eol_comments :: //

    start = '#' ;
    '''

    print(tatsu.__version__)

    parser = tatsu.compile(grammar)
    ast = parser.parse('#', trace=True)

Output:

    5.8.3
    ↙start ~1:1
    #
    ≢'#'
    ≢start ~1:1
    #

    ...

    tatsu.exceptions.FailedToken: (1:1) expecting '#' :
    #
    ^
    start

If I change the # to an a in both the grammar and the input text, parsing succeeds. I think the issue might be that # starts a comment in TatSu grammars, but I'm not sure how to escape it.

Patrick

2 Answers

The problem here is that config.eol_comments_re is not being overridden with the @@eol_comments definition.

Could you post an issue for this problem at https://github.com/neogeny/TatSu/issues?

The other problem is that comments should not be checked while parsing a quoted token such as '#'.

Apalala

I spent a long while researching this problem today and found a lot of room for improvement in TatSu.

Still, the conclusion is that // is a valid regex: it is the empty pattern, which matches zero characters of input.

The solution in your case is to set the comment patterns to strings you don't expect to find in the input:

    grammar = r'''
        @@comments :: /@@@@@@@/
        @@eol_comments :: /@@@@@@@/

        start = '#' ;
    '''
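
To see the difference between the two patterns, here is a small sketch using only Python's `re` module. It assumes TatSu hands the comment patterns to `re`; it does not reproduce TatSu's internal comment handling, and the variable names are only for illustration.

```python
import re

# Assumption: TatSu compiles the @@comments / @@eol_comments patterns
# with Python's `re` module. This only illustrates the regex behaviour.
empty = re.compile(r'')             # the pattern that `//` denotes
sentinel = re.compile(r'@@@@@@@')   # the workaround pattern from above

text = '#'

empty_match = empty.match(text)
sentinel_match = sentinel.match(text)

# The empty pattern succeeds at position 0 without consuming any input.
print(empty_match is not None, empty_match.span())

# The sentinel pattern does not match the input at all.
print(sentinel_match)
```

The empty pattern always produces a zero-length match, whereas the sentinel pattern simply fails on input that does not contain it, so it never interferes with the '#' token.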

https://github.com/neogeny/TatSu/issues/303

Apalala