Questions tagged [tatsu]

Use the [tatsu] tag for all questions related to the TatSu or Grako parser generators.

TatSu (the successor to Grako) is a tool that takes grammars in a variation of EBNF as input, and outputs memoizing (Packrat) PEG parsers in Python.
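
To make "memoizing (Packrat)" concrete, here is a minimal hand-written sketch in plain Python. It is not the code TatSu generates. The toy grammar, the `make_parser` helper, and the call counter are all assumptions chosen for illustration, and the sketch does not skip whitespace. Each rule is a function from an input position to the end position of a match (or `None`), and `functools.lru_cache` acts as the memo table, so a rule retried at the same position after backtracking is answered from the cache instead of being re-parsed.

```python
from functools import lru_cache


def make_parser(text):
    """Build memoized rule functions for a toy PEG over `text` (illustrative only)."""
    calls = {"term": 0}  # counts how often the body of `term` actually runs

    @lru_cache(maxsize=None)
    def term(pos):
        # term = '(' expr ')' / digit+
        calls["term"] += 1
        if pos < len(text) and text[pos] == "(":
            end = expr(pos + 1)
            if end is not None and end < len(text) and text[end] == ")":
                return end + 1
            return None
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        return end if end > pos else None

    @lru_cache(maxsize=None)
    def expr(pos):
        # expr = term '+' expr / term   (ordered choice)
        end = term(pos)
        if end is not None and end < len(text) and text[end] == "+":
            rest = expr(end + 1)
            if rest is not None:
                return rest
        # Second alternative: term(pos) is served from the memo table.
        return term(pos)

    return expr, calls


text = "((((1))))"
expr, calls = make_parser(text)
end = expr(0)
print(end == len(text), calls["term"])
```

Without the cache, every nesting level would parse its inner `term` twice (once per alternative of `expr`), so the work would double with each pair of parentheses. With it, `term` runs once per position.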

TatSu can also compile a grammar stored in a string into a tatsu.grammars.Grammar object that can be used to parse any given input, much like the re module does with regular expressions.
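
The comparison with `re` can be sketched as follows. The grammar text and the `compile_sum_parser` helper are assumptions made up for this example. `tatsu.compile()` and the `parse()` method of the resulting model are the calls the description refers to, and the `tatsu` import is deferred so that the `re` half runs even where TatSu is not installed.

```python
import re

# re: compile the pattern once, then reuse the compiled object.
NUMBER = re.compile(r"\d+")
numbers = NUMBER.findall("3 + 5 + 10")

# TatSu analogue: the grammar lives in a string (hypothetical toy grammar).
GRAMMAR = r"""
    start = number {'+' number}* $ ;
    number = /\d+/ ;
"""


def compile_sum_parser():
    """Compile GRAMMAR once; the returned model can parse many inputs."""
    import tatsu  # third-party package, imported lazily on purpose

    return tatsu.compile(GRAMMAR)
```

With TatSu installed, usage would look like `model = compile_sum_parser()` followed by `model.parse("3 + 5 + 10")`, reusing `model` for further inputs in the same way `NUMBER` is reused above.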

46 questions
1
vote
2 answers

TatSu: How to optimize the following grammar logic for faster parse time?

I have the following grammar in TatSu. To reduce parse time, I implemented cut operations (i.e., commit to a particular rule option once a particular token is seen). However, I still see long runtimes. On a file with about 830K lines, it takes…
user4979733
1
vote
1 answer

Is there a way to do context-sensitive parsing in TatSu?

context sensitive '%' ..... eol comments I'm starting with the grammar for PDF described here https://github.com/caradoc-org/caradoc/blob/master/doc/grammar/grammar.pdf which seems to lack the definition of eol comments. PDF has end of line…
1
vote
1 answer

How to include a literal '#' in a Tatsu grammar?

I can't get Tatsu to parse a grammar that includes a literal '#'. Here is a minimal example: G = r''' atom = /[0-9]+/ | '#' atom ; ''' p = tatsu.compile(G) p.parse('#345', trace=True) The parse throws a FailedParse exception. The trace…
RootTwo
1
vote
2 answers

Alphabetic characters not recognized in tatsu parse

I have defined a very simple grammar, but tatsu does not behave as expected. I have added a "start" rule and terminated it with a "$" character, but I still see the same behavior. If I define the "fingering" rule with a regular expression (digit =…
1
vote
1 answer

Tatsu Parsing Performance

I've implemented a grammar in Tatsu for parsing a description of a quantum program Quipper ASCII (link). The parser works but is slow for the files I'm looking at (about 10kB-1MB size, see the resources directory). It takes approximately 10-30…
1
vote
1 answer

How to get concise syntax error messages from grako/TatSu

If the input to a grako/tatsu generated parser has a syntax error, such as 3 + / 3 to the calc.py examples, one gets a long list of Python calling sequences in addition to the relevant 3 + / 3 ^ I could use try - except constructions but then…
koskenni
1
vote
2 answers

Is it possible to use a different lexer?

I would like to use a different lexer for tatsu, yet use tatsu's parser. Is this possible? For example, in the grammar: expr = NUM | ID | (expr '+' expr) ; is it possible to use an alternative lexer to provide NUM and ID?
1
vote
1 answer

Cannot define rule priority in grako grammar for handling special tokens

I am trying to analyze some documents by a grammar generated via Grako that should parse simple sentences for further analysis but face some difficulties with some special tokens. The (Grako-style) EBNF looks like: abbr::str = "etc." |…
0
votes
1 answer

How to use #include in TatSu grammar files?

The #include pragma with relative path does not work. With a grammar file containing ... #include :: "secondary.ebnf" and code to compile it with open("/full/path/to/main.ebnf") as source: psr = tatsu.compile(source.read()) I'm getting…
volferine
0
votes
2 answers

Matching the hash character in Tatsu

I am getting an exception attempting to parse the # character using Tatsu: import tatsu grammar = r''' @@comments :: // @@eol_comments :: // start = '#' ; ''' print(tatsu.__version__) parser = tatsu.compile(grammar) ast = parser.parse('#',…
Patrick
0
votes
0 answers

Unexpected output from TatSu parser

The below TatSu grammar (TatSu 5.8.3, Python 3.11) creates an unexpected output from the given input: I expected a nested xxx yy, but the brackets [] are completely ignored: @@grammar :: Test @@whitespace :: /[\t ]+/ start = script ; script =…
Painter
0
votes
1 answer

Tatsu Parser, unclear why it isn't moving to the next rule in the line?

I am writing a code parser/formatter for a language that doesn't have one, OSTW (Overwatch higher level language for workshop code). So that I can be lazy and have pretty code. I am pretty new to this idea, so if tatsu is a poor choice for this…
Mriswithe
0
votes
1 answer

tatsu.exceptions.FailedParse while using a C BNF grammar adapted to Tatsu

tatsu.exceptions.FailedParse: (52:24) expecting one of: "'" '"' : declarator = {pointer}? direct_declarator ; ^ I found a C BNF grammar here:…
jokoon
0
votes
1 answer

Parsing unique but unordered named blocks

I have a DSL where a file consists of multiple named blocks. Ideally, each block should occur only once, but the order doesn't matter. How do I write a parser that ignores block order, but gives syntax errors if the same block is repeated?
shader
0
votes
1 answer

Is there a Tatsu or any PEG-format grammar available for the [g]awk language syntax?

As the subject asks, does anyone know of an existing Tatsu grammar (or at least a PEG-format grammar) for the [g]awk language? I did already browse all existing Tatsu examples that I could find, and searched extensively around the net for any…
pjfarley3