Getting next possible tokens with lark parsing

Question

Is there a way to get the next possible tokens for a given string and a given grammar with Lark?

For example, if I have the grammar:

?start: NAME "=" possible_values
possible_values: "apple" | "banana" | "orange"

and I input the string "my_variable =", the next possible tokens would be "apple", "banana" or "orange".

Is there any built-in functionality that can help me achieve this?

score 1 · Accepted Answer · answered Jun 20 '21 at 05:41

Yes, this was added in the newest release (0.11.3).

It is called InteractiveParser, and you get one from Lark.parse_interactive. It currently only works with parser='lalr', and the interface might change before version 1.0.

It can be used like this:

from lark import Lark

parser = Lark(r"""
?start: NAME "=" possible_values
possible_values: "apple" | "banana" | "orange"
NAME: /\w+/
%ignore /\s+/
""", parser="lalr")

interactive = parser.parse_interactive("my_variable = ")

# Feeds the text given above into the parser. This is not done automatically.
interactive.exhaust_lexer()


# Returns the names of the terminals that are currently accepted.
print(interactive.accepts())

Note that accepts returns a set of terminal names. These are mostly helpful, but some may be auto-generated and less than helpful (something like __ANON_0). The actual definitions are accessible via parser.terminals, a list from which you have to extract the correct definition:

term_name = "BANANA"

term_def = next(t for t in parser.terminals if t.name==term_name)

print(term_def.name)
print(term_def.pattern)
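
To tie the two steps above together, here is a minimal sketch that maps every name returned by accepts() to the text of its pattern. The names describe_accepted and demo are hypothetical helpers, not part of Lark. The sketch assumes each terminal definition exposes its pattern text as pattern.value, and that some accepted names (such as an end-of-input marker) may have no definition in parser.terminals.

```python
GRAMMAR = r"""
?start: NAME "=" possible_values
possible_values: "apple" | "banana" | "orange"
NAME: /\w+/
%ignore /\s+/
"""


def describe_accepted(accepted_names, terminal_defs):
    """Map each accepted terminal name to its pattern text (hypothetical helper)."""
    # Index the definitions by name so each lookup is a dict access
    # instead of a scan over the whole list.
    by_name = {t.name: t for t in terminal_defs}
    described = {}
    for name in sorted(accepted_names):
        term = by_name.get(name)
        # Assumption: a name without a definition falls back to the name itself.
        described[name] = term.pattern.value if term is not None else name
    return described


def demo():
    # Imported here so the helper above can be reused without Lark installed.
    from lark import Lark

    parser = Lark(GRAMMAR, parser="lalr")
    interactive = parser.parse_interactive("my_variable = ")
    interactive.exhaust_lexer()
    return describe_accepted(interactive.accepts(), parser.terminals)
```

With Lark installed, print(demo()) should show the accepted terminal names next to the literal strings they stand for, which is usually more useful for suggestions or autocompletion than the bare names.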

Here are the docs on InteractiveParser

(For faster answers in the future, post a link to your SO question on Gitter.)
