Questions tagged [ply]

PLY is an implementation of the lex and yacc parsing tools for Python. Please do not use this tag for the PLY graphics file format (use ply-file-format) or for the plyr / dplyr R packages, which have their own tags.

# About PLY

PLY is a parser generator that uses reflection to read token definitions and production rules written in pure Python. You can, for example, define a token by assigning a regular expression string to a variable, or by writing a function or method whose docstring contains the regular expression.


PLY is no longer distributed in any package-installable form. Since it has no dependencies, it can be used in a project that targets Python 3.6 or later by simply copying two files into the project directory. To acquire the files, either clone the GitHub repository or just download the two files lex.py and yacc.py.

Note: Do not use pip to install PLY; it will install a broken distribution on your machine.


Examples of token definitions with code to interpret the matched value:

def t_BOOLEAN(token):
    r'(?:true|false)'
    # Convert the matched text to a Python bool.
    token.value = token.value == "true"
    return token

def t_NUMBER(token):
    r'[0-9]+'
    # Convert the matched digits to an int.
    token.value = int(token.value)
    return token
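
Tokens that need no value conversion can instead be declared by assigning the regular expression directly to a module-level variable. A minimal, self-contained sketch (the token names and the sample input are illustrative):

import ply.lex as lex

tokens = ('BOOLEAN', 'NUMBER', 'PLUS')

# Simple token: the regular expression is assigned directly to t_<NAME>.
t_PLUS = r'\+'

# Function tokens: the regular expression lives in the docstring and the
# body can convert the matched text before returning the token.
def t_BOOLEAN(token):
    r'(?:true|false)'
    token.value = token.value == "true"
    return token

def t_NUMBER(token):
    r'[0-9]+'
    token.value = int(token.value)
    return token

t_ignore = ' \t'

def t_error(token):
    print("Illegal character %r" % token.value[0])
    token.lexer.skip(1)

lexer = lex.lex()
lexer.input("true 1 + 2")
for tok in lexer:
    print(tok.type, tok.value)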



370 questions
3 votes, 1 answer

PLY - return multiple tokens

AFAIK the technique for lexing Python source code is: When current line's indentation level is less than previous line's, produce DEDENT. Produce multiple DEDENTs if it is closing multiple INDENTs. When end of input is reached, produce DEDENT(s) if…
Ron
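
A common way to emit the several DEDENT tokens described above is to wrap the PLY lexer in an object whose token() method drains a small queue, since a PLY token rule itself can only return one token. A rough sketch, assuming the NEWLINE rule has stored the following line's indentation width in its value (all names are illustrative):

import ply.lex as lex

class IndentLexer:
    """Wraps a PLY lexer and synthesizes INDENT/DEDENT tokens from a queue."""
    def __init__(self, lexer):
        self.lexer = lexer
        self.indents = [0]   # stack of currently open indentation widths
        self.queue = []      # synthesized tokens waiting to be handed out

    def _make(self, type_, lineno):
        tok = lex.LexToken()
        tok.type, tok.value, tok.lineno, tok.lexpos = type_, None, lineno, 0
        return tok

    def input(self, data):
        self.lexer.input(data)

    def token(self):
        if self.queue:
            return self.queue.pop(0)
        tok = self.lexer.token()
        if tok is None:
            # End of input: close every indentation level that is still open.
            while self.indents[-1] > 0:
                self.indents.pop()
                self.queue.append(self._make('DEDENT', self.lexer.lineno))
            return self.queue.pop(0) if self.queue else None
        if tok.type == 'NEWLINE':
            width = tok.value
            if width > self.indents[-1]:
                self.indents.append(width)
                self.queue.append(self._make('INDENT', tok.lineno))
            while width < self.indents[-1]:
                self.indents.pop()
                self.queue.append(self._make('DEDENT', tok.lineno))
        return tok

The wrapper can then be handed to the parser with parser.parse(text, lexer=IndentLexer(lex.lex())).
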
3 votes, 2 answers

Python PLY zero or more occurrences of a parsing item

I am using Python with PLY to parse LISP-like S-expressions, and when parsing a function call there can be zero or more arguments. How can I put this into the yacc code? This is my function so far: def p_EXPR(p): '''EXPR : NUMBER |…
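
For the zero-or-more case asked about here, the usual pattern is a left-recursive list rule with an empty alternative. A minimal grammar-rule sketch to drop into an existing PLY parser module (nonterminal names follow the question's style but are otherwise illustrative):

def p_EXPR_call(p):
    'EXPR : LPAREN SYMBOL EXPRLIST RPAREN'
    p[0] = [p[2]] + p[3]

def p_EXPRLIST(p):
    '''EXPRLIST : EXPRLIST EXPR
                | empty'''
    p[0] = [] if len(p) == 2 else p[1] + [p[2]]

def p_empty(p):
    'empty :'
    pass

Left recursion (EXPRLIST on the left of the recursive alternative) is preferred in an LALR parser such as PLY because it keeps the parser stack shallow.
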
3 votes, 3 answers

Python Lex-Yacc (PLY) Error recovery at the end of input

Problem: I am trying to implement an error-tolerant parser using Python Lex-Yacc (PLY), but I have trouble using error recovery rules at the end of my input string. How can I recover from an unexpected end of input? Example: This example grammar…
Jen-Ya
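
For context, PLY calls p_error with None when the parser runs out of tokens, so an unexpected end of input can at least be detected and reported there; genuine recovery usually means feeding the parser a synthetic token, which is beyond this sketch:

def p_error(p):
    if p is None:
        # The token stream ended while the parser still expected more input.
        print("Syntax error: unexpected end of input")
    else:
        print("Syntax error at %s (%r) on line %d" % (p.type, p.value, p.lineno))
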
3 votes, 1 answer

Use PLY to match a normal string

I am writing a parser by using PLY. The question is similar to this one How to write a regular expression to match a string literal where the escape is a doubling of the quote character?. However, I use double-quote to open and close a string. For…
Loi.Luu
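
For reference, a token rule for a double-quoted string in which the quote character is escaped by doubling it can be written as below; collapsing each doubled quote in the value is one reasonable choice, not the only one:

def t_STRING(token):
    r'"(?:[^"]|"")*"'
    # Drop the surrounding quotes and collapse each doubled quote.
    token.value = token.value[1:-1].replace('""', '"')
    return token
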
3 votes, 2 answers

PLY yacc specifying multiline production

Is there a way to define a multi-line production with the following syntax? PLY expects : before ID, implying one production per line. def p_envvar(p): ''' envvar : EV \ ID \ COLON…
satish
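
As far as PLY's grammar-docstring parser goes, yacc.py splits the docstring into lines, and every non-blank line must either begin a rule with name : or begin with | to add another alternative. The simplest fix is therefore to keep each alternative's symbols on a single physical line. A sketch of the accepted layout, where VALUE stands in for whatever follows COLON in the truncated rule above:

def p_envvar(p):
    '''envvar : EV ID COLON VALUE
              | EV ID'''
    p[0] = tuple(p[1:])
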
3 votes, 1 answer

Resolving a shift/reduce conflict in an LALR parser

I've been using PLY to build up a parser for my language, however I've got a shift/reduce conflict that's causing me some trouble. My language has generic types with a syntax ala C++ templates. So right now I have rules like: expression :…
Alex Gaynor
3 votes, 2 answers

Reporting parse errors from PLY to caller of parser

So I've implemented a parser using PLY — but all the PLY documentation deals with parse and tokenization errors by printing out error messages. I'm wondering what the best way to implement non-fatal error-reporting is, at an API level, to the caller…
gsnedders
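
One common pattern is to have t_error and p_error record diagnostics (or raise an exception) instead of printing, and let the caller inspect them after parse() returns. A rough sketch; the ParseIssue class, the errors list and the parse() wrapper are illustrative and not part of PLY, and lexer/parser are assumed to be built elsewhere with lex.lex() and yacc.yacc():

class ParseIssue:
    def __init__(self, message, lineno):
        self.message = message
        self.lineno = lineno

errors = []   # filled during parsing, examined by the caller afterwards

def t_error(t):
    errors.append(ParseIssue("illegal character %r" % t.value[0], t.lexer.lineno))
    t.lexer.skip(1)

def p_error(p):
    if p is None:
        errors.append(ParseIssue("unexpected end of input", 0))
    else:
        errors.append(ParseIssue("unexpected %s %r" % (p.type, p.value), p.lineno))

def parse(text):
    # Hypothetical wrapper: returns the result plus any collected issues.
    del errors[:]
    result = parser.parse(text, lexer=lexer)
    return result, list(errors)
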
3 votes, 1 answer

Infinite recursion caused by multiple occurrences of a parsing item YACC-PLY

I'm dealing with a Yacc (the PLY one) and I have no idea how to allow more occurrences of a parsing item without making the program crash due to infinite recursion. Let's say I have: def p_Attribute(p): ''' Attribute : STRING …
JAWE
3 votes, 3 answers

several lexers for one parser with PLY?

I'm trying to implement a Python parser using PLY for the Kconfig language used to generate the configuration options for the Linux kernel. There's a keyword called source which performs an inclusion, so what I do is that when the lexer encounters…
LB40
3 votes, 1 answer

Python PLY Lex ambiguity

I have a problem with ambiguity at the token level. The problem looks like this. My code looks like this so that token t_UN1 has higher precedence. t_ignore = ' \t\v\r' # whitespace .... def t_UN1(t): #NS_ r'NS\_' return t def t_IDENTIFIER(t): …
3 votes, 1 answer

How can I create a ply rule for recognizing CRs?

I have trouble distinguishing between \r (0x0d) and \n (0x0a) in my PLY lexer. A minimal example is the following program: import ply.lex as lex # token names tokens = ('CR', 'LF') # token regexes t_CR = r'\r' t_LF = r'\n' # chars to…
esg
3 votes, 3 answers

RegEx with variable data in it - ply.lex

I'm using the Python module ply.lex to write a lexer. I have some of my tokens specified with regular expressions, but now I'm stuck. I have a list of keywords that should each be a token. data is a list of about 1000 keywords which should all be recognised as…
Sean M.
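
The approach PLY's documentation recommends for keyword sets, large or small, is to match everything with one identifier rule and then reclassify the match against a lookup table built from the keyword list, rather than writing one regular expression per keyword. A sketch with illustrative names:

keywords = ['begin', 'end', 'while', 'do']        # stand-in for the ~1000-entry list
reserved = {kw: kw.upper() for kw in keywords}    # keyword text -> token name

tokens = ['IDENTIFIER'] + list(reserved.values())

def t_IDENTIFIER(t):
    r'[A-Za-z_][A-Za-z0-9_]*'
    # Reclassify the match if it happens to be one of the keywords.
    t.type = reserved.get(t.value, 'IDENTIFIER')
    return t
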
3 votes, 3 answers

PLY: Token shifting problem in C parser

I'm writing a C parser using PLY, and recently ran into a problem. This code: typedef int my_type; my_type x; is correct C code, because my_type is defined as a type before being used as such. I handle it by filling a type symbol table in…
Eli Bendersky
2 votes, 1 answer

How to write a PLY interface for hand-written lexer?

I'm writing a compiler in Python, and I made a hand-written lexer, because I can't figure out how to parse indentation in PLY. Also, my lexer uses some yield statements like so: def scan(): ... for i in tokens: if i[0]: yield…
Sammi De Guzman
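
For reference, yacc.parse() only needs an object with a token() method that returns token-like objects carrying type, value, lineno and lexpos attributes, and None when the input is exhausted, so a generator-based hand-written lexer can be adapted with a thin wrapper. A rough sketch with illustrative names:

class Token:
    def __init__(self, type_, value, lineno=0, lexpos=0):
        self.type = type_
        self.value = value
        self.lineno = lineno
        self.lexpos = lexpos

class GeneratorLexer:
    """Adapts a token-yielding generator to the interface yacc expects."""
    def __init__(self, generator):
        self.generator = generator

    def token(self):
        # Returning None tells the parser that the input is exhausted.
        return next(self.generator, None)

# Usage, assuming scan() yields Token objects and parser was built with yacc.yacc():
#     result = parser.parse(lexer=GeneratorLexer(scan()))
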
2 votes, 2 answers

Lexing sum operator and a signed integer with PLY Python

How can I build my raw expression to differentiate between a sum operator and a signed integer? I'm using PLY Python. This, unfortunately, didn't work: t_sum=r'\+' def t_integer(token): r'[-+]?\d+'
Academia
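
A common resolution is to keep the lexer sign-free, so the integer token matches digits only, and to handle unary plus/minus in the grammar with a high-precedence rule. A small runnable sketch with illustrative names:

import ply.lex as lex
import ply.yacc as yacc

tokens = ('INTEGER', 'PLUS', 'MINUS')

t_PLUS = r'\+'
t_MINUS = r'-'
t_ignore = ' \t'

def t_INTEGER(t):
    r'\d+'
    # No sign here: '+' and '-' always arrive as separate tokens.
    t.value = int(t.value)
    return t

def t_error(t):
    t.lexer.skip(1)

precedence = (
    ('left', 'PLUS', 'MINUS'),
    ('right', 'UMINUS'),        # fictitious token used only for precedence
)

def p_expr_binop(p):
    '''expr : expr PLUS expr
            | expr MINUS expr'''
    p[0] = p[1] + p[3] if p[2] == '+' else p[1] - p[3]

def p_expr_uminus(p):
    'expr : MINUS expr %prec UMINUS'
    p[0] = -p[2]

def p_expr_integer(p):
    'expr : INTEGER'
    p[0] = p[1]

def p_error(p):
    print("Syntax error")

lexer = lex.lex()
parser = yacc.yacc()
print(parser.parse("1 - -2 + 3"))   # prints 6
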