Can anyone provide a clear explanation and some simple examples that show the "empty loop in rule" error, which is apparently related to match-time capture (Cmt)?

I don't understand the only mention that I can find, which is at

http://lua-users.org/lists/lua-l/2013-06/msg00086.html

Thanks

sailco12
  • Can you show the code that caused it? – Chris Beck Jun 18 '16 at 06:43
  • I am interested in the "Domain Specific Language generator for Lua" https://github.com/weshoke/DSL which runs correctly under Lua (version 5.1) and LPeg (version 0.11-2). I want to update it for more modern versions (see the issue at https://github.com/weshoke/DSL/issues/3 ). Soon I will try to make a GitHub fork with my updated version. Then I will provide a link to it here so you can run it and see the error if you wish. Thanks. – sailco12 Jun 18 '16 at 15:17
  • I don't have much experience with LPEG, but a general idea in PEG is that it does not support "left recursion", and it is an error if you provide a grammar that does that. "empty loop in rule" sounds a lot like left recursion to me. Here's an example I encountered using `boost::spirit`, which is another PEG-based technology: http://stackoverflow.com/questions/33325243/segmentation-fault-with-trivial-spirit-parser Just a guess, HTH – Chris Beck Jun 18 '16 at 17:21
  • That's a good guess, and I think Lpeg has never accepted left recursion (see http://lua-users.org/lists/lua-l/2010-03/msg00052.html ). But the grammar has not changed, and it works correctly under the old versions of Lua and Lpeg. – sailco12 Jun 19 '16 at 02:27
  • `But the grammar has not changed, and it works correctly under the old version of Lua and Lpeg.` But that could be explained by this line from the email you linked: `Older versions of LPeg were "optimistic" in those cases, and blindly assumed that the pattern consumes something (and therefore that the loop is valid). The new version is more strict and refuses such loops.` It might be that the loop was always in the grammar but didn't trigger on any test cases you actually ran before, so you just didn't see it. Now it's stricter and won't accept the invalid input. Again, just a guess. :) – Chris Beck Jun 19 '16 at 02:29
  • BTW: You really should post your code that is causing the error. A similar post occurred on meta recently: http://meta.stackoverflow.com/questions/326267/is-it-inappropriate-to-ask-what-a-compiler-error-message-means If you just ask "what does this error message mean" without any context it's really very difficult for anyone to know for sure, or give a useful answer. Especially if you don't say what versions of `lua` and `lpeg`. – Chris Beck Jun 19 '16 at 03:53

1 Answer

So this question is a bit old, but it's one of the first search results. There's not a lot on the Internet about this, and it may not be super obvious what's wrong.

The error message is a bit misleading. In formal PEG terms, at least as I understand them, a repetition operator is being applied to a parsing expression that can succeed without consuming any input.

Or, in other words, LPeg has detected a loop whose body can match the empty string, so the loop would never terminate. There's a pure Lua implementation called LuLPeg which lacks this particular check; if you run the same grammar there, it could easily enter an infinite loop.
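As a rough illustration of that check, here is a minimal sketch, assuming a recent LPeg release (1.0 or later is what I had in mind) is installed. The rule names `list` and `item` are made up for the example. Outside a grammar the loop is rejected as soon as the repetition is built; inside a grammar, V"item" is only resolved when the table is compiled, so the error is raised by P{...} and names the rule.

```lua
local lpeg = require "lpeg"
local P, V = lpeg.P, lpeg.V

-- Outside a grammar: the body P("a")^0 can match "", so repeating it is rejected.
local ok_plain, err_plain = pcall(function() return (P("a")^0)^0 end)
print(ok_plain, err_plain)

-- Inside a grammar: the check runs when P{...} compiles the table.
-- "list" and "item" are hypothetical rule names used only for this sketch.
local ok_rule, err_rule = pcall(P, {
  "list",
  list = V"item"^0,   -- loop over a rule...
  item = P("a")^0,    -- ...that can succeed without consuming anything
})
print(ok_rule, err_rule)
```

Both calls should fail; the second one is the case that produces the "empty loop in rule" message.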

I'm tinkering with a little toy BASIC-ish language, and ran into the above issue with the following:

grammar = P{ "input",
  input = V"block"^0 * -1,
  block = V"expression"^0,
  -- (define expression here)
}

The idea is that the root input is an optional block of code, and a block is zero or more expressions. This is pretty simplified, of course; I'm leaving out whitespace handling and that sort of thing. LPeg refuses to build this grammar at all, but it helps to trace what would happen if you could call grammar:match(""):

  1. remaining string: "", at input. See if it matches a block.
  2. remaining string: "", at block. See if it matches an expression.
  3. remaining string: "", at expression. For the sake of time, let's say it doesn't match.
  4. remaining string: "", at block. Rule concludes with zero expressions, no input consumed.
  5. remaining string: "", at input. One block consumed, check if more blocks match.
  6. remaining string: "", at block. See if it matches an expression.

And so on. Since V"block" matches the empty string, input can find an infinite number of blocks to fulfil the rule V"block"^0. In this case, there are two good solutions: limit input to exactly one (possibly empty) block, or require a block to contain at least one expression and put the ^0 wherever a block may be absent. So either:

grammar = P{ "input", -- blocks can be empty, input contains one (empty or otherwise) block
  input = V"block" * -1,
  block = V"expression"^0,
  -- (define expression here)
}

Or:

grammar = P{ "input", -- blocks must be at least one expression, root can have zero or more blocks
  input = V"block"^0 * -1,
  block = V"expression"^1,
  -- (define expression here)
}

In the first case, an empty string will match a block, which fulfills the input rule. In the second, an empty string will fail block, fulfilling the input rule with zero matching blocks.
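To check the three variants side by side, here is a small sketch, again assuming a recent LPeg. The `expression` pattern (digits followed by optional spaces) and the `build` helper are stand-ins I made up, since the real expression rule is left out above.

```lua
local lpeg = require "lpeg"
local P, R, V = lpeg.P, lpeg.R, lpeg.V

-- Hypothetical stand-in for the elided rule: a number plus trailing spaces.
local expression = R("09")^1 * P(" ")^0

-- Hypothetical helper: compile a grammar from the two rules under test.
local function build(input, block)
  return pcall(P, { "input", input = input, block = block, expression = expression })
end

local ok_broken    = build(V"block"^0 * -1, V"expression"^0) -- original grammar
local ok_a, fix_a  = build(V"block" * -1,   V"expression"^0) -- one, possibly empty, block
local ok_b, fix_b  = build(V"block"^0 * -1, V"expression"^1) -- non-empty blocks only

print(ok_broken, ok_a, ok_b)
print(fix_a:match(""), fix_a:match("1 2 3"))
print(fix_b:match(""), fix_b:match("1 2 3"))
```

With no captures, match returns the position after the match, so both fixed grammars should accept the empty string as well as a run of expressions.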

I haven't needed to use Cmt yet, but I believe what happened was that old versions of LPeg assumed the function would either fail or consume input, even if the pattern inside the Cmt call could match an empty string. More recent releases don't make this assumption.
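For what it's worth, this is how I would expect that to look, as a sketch only and assuming a recent LPeg. The `accept` function is a made-up match-time function that always succeeds at the current position; the point is that the loop check looks at the pattern inside Cmt, not at what the function might do.

```lua
local lpeg = require "lpeg"
local P, Cmt = lpeg.P, lpeg.Cmt

-- Hypothetical match-time function: always succeeds, keeps the current position.
local function accept(subject, pos)
  return pos
end

-- The pattern inside Cmt can match "", so looping over it is refused.
local nullable = Cmt(P(true), accept)
local ok_nullable, err_nullable = pcall(function() return nullable^0 end)
print(ok_nullable, err_nullable)

-- The pattern inside Cmt must consume one character, so the loop is accepted.
local consuming = Cmt(P(1), accept)
local ok_consuming, loop = pcall(function() return consuming^0 end)
print(ok_consuming, lpeg.type(loop))
```

So if an older grammar relied on a Cmt whose inner pattern can match nothing, making that inner pattern consume at least one character (or moving the repetition) is probably where to start.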

SirNuke