
I have been using Treetop for a while. I wrote my rules following

http://thingsaaronmade.com/blog/a-quick-intro-to-writing-a-parser-using-treetop.html

I can parse my whole input string, but none of the other to_array functions gets triggered; only the initial one does.

Then I found https://whitequark.org/blog/2011/09/08/treetop-typical-errors/, which talks about AST squashing, and I figured that my rule is doing the same.

The first rule I have is

  rule bodies
    blankLine* interesting:(body+) blankLine* <Bodies>
  end

and everything is getting gobbled up by body.

Can someone suggest what I can do to fix this?

Edit: Adding code snippet:

grammar Sexp

  rule bodies
    blankLine* interesting:(body+) blankLine* <Bodies>
  end

  rule body
    commentPortString (ifdef_blocks / interface)+ (blankLine / end_of_file) <Body>
  end

  rule interface
    space? (intf / intfWithSize) space?  newLine <Interface>
  end

  rule commentPortString
    space? '//' space portString space?  <CommentPortString>
  end

  rule portString
    'Port' space? '.' newLine <PortString>
  end

  rule expression
    space? '(' body ')' space? <Expression>
  end

  rule intf
    (input / output) space wire:wireName space? ';' <Intf>
  end

  rule intfWithSize
    (input / output) space? width:ifWidth space? wire:wireName space? ';' <IntfWithSize>
  end

  rule input
    'input' <InputDir>
  end

  rule output
    'output' <OutputDir>
  end

  rule ifdef_blocks
    ifdef_line (interface / ifdef_block)* endif_line <IfdefBlocks>
  end

  rule ifdef_block
    ifdef_line interface* endif_line <IfdefBlocks>
  end

  rule ifdef_line
    space? (ifdef / ifndef) space+  allCaps space? newLine <IfdefLine>
  end

  rule endif_line
    space? (endif) space? newLine <EndifLine>
  end

  rule ifdef
    '`ifdef' <Ifdef>
  end

  rule ifndef
    '`ifndef' <Ifndef>
  end

  rule endif
    '`endif' <Endif>
  end

  rule ifWidth
    '[' space? msb:digits space? ':' space? lsb:digits ']' <IfWidth>
  end

  rule digits
    [0-9]+ <Digits>
  end

  rule integer
    ('+' / '-')? [0-9]+ <IntegerLiteral>
  end

  rule float
    ('+' / '-')? [0-9]+ (('.' [0-9]+) / ('e' [0-9]+)) <FloatLiteral>
  end

  rule string
    '"' ('\"' / !'"' .)* '"' <StringLiteral>
  end

  rule identifier
    [a-zA-Z\=\*] [a-zA-Z0-9_\=\*]* <Identifier>
  end

  rule allCaps
    [A-Z] [A-Z0-9_]*
  end

  rule wireName
    [a-zA-Z] [a-zA-Z0-9_]* <WireName>
  end

  rule non_space
    !space .
  end

  rule space
    [^\S\n]+
  end

  rule blankLine
    space* newLine
  end

  rule not_newLine
    !newLine .
  end

  rule newLine
    [\n\r]
  end

  rule end_of_file
    !.
  end

end

Test string

// Port.
input         CLK;

// Port.
input         REFCLK;

// Port.
input [ 41:0] mem_power_ctrl;
output data;

EDIT: Adding more details

The test code is checked into: https://github.com/justrajdeep/treetop_ruby_issue.

As you can see in my node_extensions.rb, all the nodes except Bodies raise an exception in their to_array method, but none of the exceptions trigger.

justrajdeep
  • There are 10 watchers on the Treetop tag, and 77.5k on Ruby. I think you would have much better luck getting an answer if you provided an MCVE: https://stackoverflow.com/help/mcve – Casper Aug 01 '18 at 18:46
  • Thanks, adding code snippet. – justrajdeep Aug 01 '18 at 19:05
  • It would be helpful if your MCVE focussed a bit more on the "minimal" part. You should also describe what problem you're encountering (not just the expected cause). You've mentioned a (somewhat misleading) blog post about "AST squashing", but that concept wouldn't apply to anything in your grammar, and certainly not to your `bodies` rule. Why do you think that your problem is related to the phenomenon described in the blog? What unexpected behaviour are you observing, and which behaviour would you expect instead? – sepp2k Aug 02 '18 at 19:02
  • @sepp2k- Thanks. Added details. – justrajdeep Aug 02 '18 at 19:28
  • @justrajdeep "all the nodes except the Body raise an exception" That's not true. `Body` does raise an exception, `Bodies` is the one that doesn't throw an exception. With that in mind: Where in your call do you call `to_array` on anything other than `Bodies`? – sepp2k Aug 02 '18 at 19:50
  • @sepp2k - I have checked in a smaller code example. Sorry, I meant `Bodies`. I call `to_array` from `test.rb`. My expectation is that when any of the nodes match, the corresponding node extension's `to_array` gets called. In my example only the `Bodies` `to_array` ever gets called. None of the others get called. – justrajdeep Aug 02 '18 at 20:43

2 Answers

You call to_array on tree, which is a Bodies. That is the only thing you ever call to_array on, so no other to_array method will be called.

If you want to_array to be called on child nodes of the Bodies node, Bodies#to_array needs to call to_array on those child nodes. So if you want to call it on the Body nodes you labelled interesting, you should iterate over interesting and call .to_array on each element.
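A minimal sketch of that delegation, assuming nothing about the real node_extensions.rb: `FakeNode` is a hypothetical stand-in for Treetop's syntax node class so the example runs without the gem, and the `interesting` accessor is written by hand here, whereas in the real grammar it comes from the `interesting:` label.

```ruby
# Hypothetical stand-in for a Treetop syntax node, so this runs without the gem.
# Only `elements` and `text_value` are mimicked; real nodes carry much more.
class FakeNode
  attr_reader :text_value, :elements

  def initialize(text_value, elements = [])
    @text_value = text_value
    @elements = elements
  end
end

class Interface < FakeNode
  # A leaf: produces its own value, nothing left to delegate to.
  def to_array
    [:interface, text_value.strip]
  end
end

class Body < FakeNode
  # Nothing calls to_array on children automatically; Body has to do it.
  # Children without a to_array (plain nodes such as whitespace) are skipped.
  def to_array
    elements.select { |e| e.respond_to?(:to_array) }.map(&:to_array)
  end
end

class Bodies < FakeNode
  # Assumption: hand-written here; in the grammar the label on (body+)
  # would supply this accessor.
  def interesting
    elements.first
  end

  # Iterate over the labelled Body nodes and call to_array on each one.
  def to_array
    interesting.elements.map(&:to_array)
  end
end

clk    = Interface.new("input         CLK;\n")
refclk = Interface.new("input         REFCLK;\n")
bodies = FakeNode.new("", [Body.new("", [clk]), Body.new("", [refclk])])
tree   = Bodies.new("", [bodies])

p tree.to_array
# prints [[[:interface, "input         CLK;"]], [[:interface, "input         REFCLK;"]]]
```

With this shape, an exception raised in `Body#to_array` or `Interface#to_array` would surface, because `Bodies#to_array` now reaches those methods.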

sepp2k

Try breaking (body+) into a new rule like this:

rule bodies
   blankLine* interesting:interesting blankLine* <Bodies>
end

rule interesting
   body+ <Interesting>
end

Otherwise, it would be helpful to see the SyntaxNode classes.

Josh Voigts