I'm making a serialization library for Lua, and I'm using LPeg to parse the string. I've got K/V pairs working (with the key explicitly named), but now I'm going to add auto-indexing.

It'll work like so:

@"value"
@"value2"

Will evaluate to

{
  [1] = "value",
  [2] = "value2"
}

I've already got the value matching working (strings, tables, numbers, and Booleans all work perfectly), so I don't need help with that; what I'm looking for is the indexing. Each match of @[value pattern] should be assigned an index equal to the number of @[value pattern] matches found so far. In other words, I can match a sequence of values (@"value1" @"value2"), but I don't know how to assign them indexes according to the order in which they matched. If that's not clear enough, just comment and I'll attempt to explain it better.

Here's roughly what my current pattern looks like (using compressed notation):

local process = {} -- Process a captured value
process.number = tonumber
process.string = function(s) return s:sub(2, -2) end -- Strip off the opening and closing quotes
process.boolean = function(s) return s == "true" end

number = [decimal number, scientific notation] / process.number
string = [double or single quoted string, supports escaped quotation characters] / process.string
boolean = (P("true") + "false") / process.boolean
table = [balanced brackets] / [parse the table]

type = number + string + boolean + table

at_notation = (P("@") * whitespace * type) / [creates a table that includes the key and value]

As you can see in the last line of code, I've got a function that does this:

k,v matched in the pattern
-- turns into --
{k, v}
-- which is then added into an "entry table" (I loop through it and add it into the return table)
Caleb P
  • Well, not an LPeg specific idea, but can't you just loop over matches, and use `t[#t+1]=matched_value` ? This would even work with simple Lua pattern matching. – jpjacobs Oct 23 '13 at 12:47
  • As an example of the previous comment: `p=function(s)local t={} for m in s:gmatch('@([^@]*)') do t[#t+1]=loadstring('return '..m)()end return t end`. Might need some scrubbing for proper sandboxing and 5.2 compatibility, but you get the point. As example usage: `p('@"blah" @1 @print @function(x)return x*2 end')` anything not @ is value. – jpjacobs Oct 23 '13 at 12:53
  • It's not clear what you are looking for. Doesn't the index above already indicate the number of captures? – greatwolf Oct 23 '13 at 14:51
  • Thanks for the advice :) Do you know of a way to do this explicitly in LPeg? It would be easiest with `Ct([matches])`, count them, and insert them, but faster and "prettier" to do it from within LPeg. – Caleb P Oct 23 '13 at 14:52
  • Sorry, didn't see @greatwolf's post. – Caleb P Oct 23 '13 at 14:52
  • @greatwolf: Yes, it does, but I don't have it working like that yet. – Caleb P Oct 23 '13 at 14:53
  • Can you add what your current grammar looks like? Is the format you're parsing always in the form `@"value"\n`? – greatwolf Oct 23 '13 at 14:57
  • My data will be in any form `@[whitespace?]"value"[whitespace]@"value2"`. I edited my original post with more info :) – Caleb P Oct 23 '13 at 15:30
  • What would the "key" part of this match be? It can't be `@` because every match would keep reusing it and replacing the old result in the capture table. If the current produced table isn't of the desired indexed form, what does it currently look like? – greatwolf Oct 23 '13 at 22:01
  • What I mean by "key" is: I already have it capturing explicit k,v pairs as key = value. Each pair is inserted into the table as {key, value}, so if indexing could create the same sort of index {index, value} that would be the same thing. – Caleb P Oct 23 '13 at 22:07

1 Answer

Based on what you've described so far, you should be able to accomplish this using a simple capture and table capture.

Here's a simplified example I knocked up to illustrate:

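local lpeg = require 'lpeg'
local l = lpeg.locale(lpeg) -- adds locale classes (space, digit, alnum, ...) to the lpeg table

local whitesp     = l.space ^ 0
local bool_val    = (l.P "true" + "false") / function (s) return s == "true" end
local num_val     = l.digit ^ 1 / tonumber
local string_val  = '"' * l.C(l.alnum ^ 1) * '"'
local val         = bool_val + num_val + string_val
-- The table capture collects every value in match order, which gives the indexes for free
local at_notation = l.Ct( (l.P "@" * whitesp * val * whitesp) ^ 0 )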

local testdata = [[
@"value1"
  @42
@  "value2"
@true
]]

local res = l.match(at_notation, testdata)

The match returns a table containing the contents:

{
  [1] = "value1",
  [2] = 42,
  [3] = "value2",
  [4] = true
}
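
If you want each entry in the same `{index, value}` shape as your explicit key/value pairs, one option is to post-process the table capture with a function capture. The following is only a minimal sketch under a few assumptions: the value patterns are simplified stand-ins for yours, and the names `to_entries` and `at_entries` are hypothetical, introduced here purely for illustration.

```lua
local lpeg = require "lpeg"
local P, R, S, C, Ct = lpeg.P, lpeg.R, lpeg.S, lpeg.C, lpeg.Ct

-- Simplified value patterns, for illustration only
local ws         = S(" \t\r\n") ^ 0
local bool_val   = (P "true" + P "false") / function(s) return s == "true" end
local num_val    = R("09") ^ 1 / tonumber
local string_val = P '"' * C((1 - P '"') ^ 0) * P '"'
local val        = bool_val + num_val + string_val

-- Hypothetical helper: turn { v1, v2, ... } into { {1, v1}, {2, v2}, ... }
local function to_entries(values)
  local entries = {}
  for i = 1, #values do
    entries[i] = { i, values[i] }
  end
  return entries
end

-- Ct gathers the values in match order; the function capture then pairs
-- each value with its position in that list
local at_entries = Ct((P "@" * ws * val * ws) ^ 0) / to_entries

local entries = at_entries:match('@"value1" @ 42 @false')
for _, entry in ipairs(entries) do
  print(entry[1], entry[2])
end
```

Here `entries` should hold one `{index, value}` pair per `@` value, in the order the values appeared, so it can probably be merged into your existing "entry table" loop without special-casing.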
cyclaminist
greatwolf