I'm trying to use pyparsing to parse a simple BASIC program:

import pyparsing as pp

pp.ParserElement.setDefaultWhitespaceChars(" \t")

EOL = pp.LineEnd().suppress()

# An identifier is a run of letters plus an optional trailing $
identifier = pp.Combine(pp.Word(pp.alphas) + pp.Optional("$"))

# Literals (number or double quoted string)
literal = pp.pyparsing_common.number | pp.dblQuotedString

line_number = pp.pyparsing_common.integer

function = pp.Forward()


operand = function | identifier | literal
expression = pp.infixNotation(operand, [
    (pp.oneOf("* / %"), 2, pp.opAssoc.LEFT),
    (pp.oneOf("+ -"), 2, pp.opAssoc.LEFT),
])

assignment = identifier + "=" + expression

# Keywords
PRINT = pp.CaselessKeyword("print")
FOR = pp.CaselessKeyword("for")
TO = pp.CaselessKeyword("to")
STEP = pp.CaselessKeyword("step")
NEXT = pp.CaselessKeyword("next")
CHRS = pp.CaselessKeyword("chr$")

statement = pp.Forward()

print_stmt = PRINT + pp.ZeroOrMore(expression | ";")

for_stmt = FOR + assignment + TO + expression + pp.Optional(STEP + expression)
next_stmt = NEXT

chrs_fn = CHRS + "(" + expression + ")"

function <<= chrs_fn

statement <<= print_stmt | for_stmt | next_stmt | assignment

code_line = pp.Group(line_number + statement + EOL)

program = pp.ZeroOrMore(code_line)

test = """\
10 print 123;
20 print 234; 567;
25 next
30 print 890
"""

print(program.parseString(test).dump())

I do have everything else working except the print clause.

This is the output:

[[10, 'print', 123, ';', 20, 'print', 234, ';', 567, ';', 25, 'next', 30, 'print', 890]]
[0]:
  [10, 'print', 123, ';', 20, 'print', 234, ';', 567, ';', 25, 'next', 30, 'print', 890]

Based on some suggestions, I modified my parser, but for some reason it still leaks into the next row.

How do I properly define the list of items to print?

jtiai
  • How do you handle newlines? Are they ignored as whitespace? – sepp2k Sep 01 '19 at 10:25
  • I've set `ParserElement.setDefaultWhitespaceChars(" \t")`, but it looks like `ZeroOrMore()` just passes through a newline when it encounters one. If I add the newline as `stopOn`, it works correctly. – jtiai Sep 01 '19 at 11:24
  • You entered `Keyword` but you must be using `CaselessKeyword` to parse the example text. Also, does `expr` include its trailing ';'? If so, you may want to rethink. Have you written a BNF? Try "Be The Parser" and walk through your problem statement manually following the BNF, and then following the pyparsing code. To work out the "leaks to next row" issue, you'll need to post more of the actual parser code. Interesting project though, well done. – PaulMcG Sep 01 '19 at 14:49
  • What about `Keyword("PRINT") + Optional(delimitedList(expr, delim=';')) + Optional(';')`? – PaulMcG Sep 01 '19 at 14:50
  • Do you have some optional newline for line continuations as part of your parser? This may be the culprit for leaking to the next line. – PaulMcG Sep 01 '19 at 15:40

1 Answer

There is a subtle thing going on here with ParserElement.setDefaultWhitespaceChars, and it deserves better documentation. Calling it only updates the whitespace characters for pyparsing expressions created after the call. The built-in expressions like dblQuotedString and the expressions in pyparsing_common are all defined at import time, so they keep the standard set of whitespace characters to skip, which includes '\n'. If you create new copies of them using expr.copy() or just plain expr(), you get new expressions that use the updated whitespace characters.

Change:

literal = pp.pyparsing_common.number | pp.dblQuotedString
line_number = pp.pyparsing_common.integer

to:

literal = pp.pyparsing_common.number() | pp.dblQuotedString()
line_number = pp.pyparsing_common.integer()

And I think your leakage issues will be resolved.
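
To see the effect in isolation, here is a minimal sketch that does not rely on the built-in expressions at all. The names stale, fresh and text are made up for illustration; the only assumption is that an expression built before the setDefaultWhitespaceChars call behaves like the ones pyparsing defines at import time.

```python
import pyparsing as pp

# Built before the default changes, so it keeps the original whitespace
# set, which includes "\n" (illustrative stand-in for a built-in expression).
stale = pp.Word(pp.nums)

pp.ParserElement.setDefaultWhitespaceChars(" \t")

# Calling an expression with no arguments returns a copy, and the copy
# takes the whitespace characters that are the default at copy time.
fresh = stale()

text = "1\n2"

# The stale expression skips the newline and consumes both numbers.
stale_tokens = pp.OneOrMore(stale).parseString(text).asList()

# The fresh copy does not skip "\n", so matching stops at the end of line 1.
fresh_tokens = pp.OneOrMore(fresh).parseString(text).asList()

print(stale_tokens)  # ['1', '2']
print(fresh_tokens)  # ['1']
```

The same thing happens in the question's grammar: the import-time integer and number expressions skip the newline, so the print statement's ZeroOrMore keeps matching into the next line.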

PaulMcG
  • This "feature" has long been in the back of my mind to rectify. I'm fixing it in the upcoming 3.0 release so that these extra copies are not required. – PaulMcG Sep 02 '19 at 10:24
  • Amazing. Such a simple fix... :) Thank you very much. Once it's finished, I'll put my project up on GitHub as OSS. – jtiai Sep 02 '19 at 13:26