I'm trying to convert this regex pattern to lpeg: { *(?<!-)[, ](?!-) *}. It's meant to be used as a splitting pattern, turning either `2.5 2.6` or `2.5, 2.6` into 2.5 and 2.6, unless hyphens are involved. It's part of identifying semver constraints.

I've used the split function from the lpeg docs, copied here

local l = require 'lpeg'

local function split (str, sep)
  sep = l.P(sep)
  local elem = l.C((1 - sep)^0)
  local p = l.Ct(elem * (sep * elem)^0)   -- make a table capture
  return p:match(str)
end

My first stab at it works in the case there's a comma:

local h, sp = l.P'-', l.P' '
patt = sp^0 * (-h * l.S' ,' - h) * sp^0
split('2.5, 2.6', patt) -- works
split('2.5 2.6', patt) -- fails because the first sp^0 eats everything

However, it of course fails when there are only spaces. I cannot figure out how to maintain the original regex pattern's constraints and make it work in both cases. My other efforts have ended in infinite loops. Is there a way to approach capturing one instance of a pattern surrounded by greedy versions of itself, or some clever aspect of lpeg that can solve this problem?

Optimum
  • Can you provide some test cases? I don't fully understand why you use lookahead and lookbehind patterns. – Brynne Taylor Aug 31 '22 at 08:52
  • @AlexanderMisel The look ahead/behind is to prevent splits on hyphen constraints. Which I forgot to declare h as D: - I'll add that. The constraints are here: https://getcomposer.org/doc/articles/versions.md#hyphenated-version-range- If you want a failing test case, the split should fail if given `2.5 - 2.6` – Optimum Aug 31 '22 at 13:19

1 Answer

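By default, repetition in LPeg is blind greedy: a pattern like `p^0` consumes as much as it can and never gives anything back when the rest of the pattern fails. To simulate the behavior of your regular expression, we need to construct a grammar that performs non-blind greedy matching: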

S <- E1 S / E2

which matches zero or more E1s followed by E2, giving back one E1 at a time until E2 can match.
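To see the difference in isolation, here is a minimal sketch that uses the letter `a` for both E1 and E2. The names `blind` and `nonblind` are only for illustration, and `l` is assumed to be the lpeg module as in the question.

```lua
local l = require 'lpeg'

-- Blind greedy: 'a'^0 consumes every 'a' and never gives one back,
-- so the trailing 'a' has nothing left to match.
local blind = l.P'a'^0 * l.P'a'
print(blind:match('aaa'))      -- nil

-- Non-blind greedy, written as the grammar S <- 'a' S / 'a'.
-- When the recursive alternative fails, the second alternative is tried.
local nonblind = l.P{
  'S',
  S = l.P'a' * l.V'S' + l.P'a',
}
print(nonblind:match('aaa'))   -- 4 (the position after the match)
```

Applied to the separator in the question, E1 becomes a single space and E2 the rest of the pattern: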

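local patt = l.P{
  's',
  -- S <- E1 S / E2: take one leading space and retry, otherwise match the separator itself
  s  = ' ' * l.V's' + l.V'suffix',
  -- not preceded by '-', one comma or space, not followed by '-', then any trailing spaces
  suffix = -l.B'-' * l.S', ' * -l.P'-' * l.P' '^0
}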
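For completeness, here is a sketch that plugs this grammar into the `split` function from the question and runs it on the three inputs mentioned there. The helper `show` is hypothetical and only joins the pieces with `|` so the result is easy to read.

```lua
local l = require 'lpeg'

-- split as given in the question (taken from the LPeg documentation)
local function split(str, sep)
  sep = l.P(sep)
  local elem = l.C((1 - sep)^0)
  local p = l.Ct(elem * (sep * elem)^0)
  return p:match(str)
end

-- separator grammar from this answer
local sep = l.P{
  's',
  s = ' ' * l.V's' + l.V'suffix',
  suffix = -l.B'-' * l.S', ' * -l.P'-' * l.P' '^0,
}

-- hypothetical helper: join the split pieces for display
local function show(str)
  return table.concat(split(str, sep), '|')
end

print(show('2.5, 2.6'))    -- 2.5|2.6
print(show('2.5 2.6'))     -- 2.5|2.6
print(show('2.5 - 2.6'))   -- 2.5 - 2.6 (hyphen range is left unsplit)
```

Like the original regex, this only refuses to split on a space or comma that is directly next to a hyphen, so an input with two spaces before the hyphen would still be split at the first space.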