
I have the following code to split a string on whitespace:

text = "I am 'the text'"
for string in text:gmatch("%S+") do
    print(string)
end

The result:

I
am
'the
text'

But I need this output instead:

I
am
the text --[[yep, without the quotes]]

How can I do this?

Edit: to add some context, the idea is to pass parameters from one program to another. Here is the pull request I am working on, currently in review: https://github.com/mpv-player/mpv/pull/1619

m45t3r

3 Answers

7

There may be ways to do this with clever parsing, but an alternative is to split on whitespace as before, keep track of a simple state, and merge the fragments that belong to the same quoted string. Something like this may work:

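local text = [[I "am" 'the text' and "some more text with '" and "escaped \" text"]]
-- spat/epat capture a quote at the start/end of a fragment;
-- buf holds the quoted token being assembled, quoted its opening quote character
local spat, epat, buf, quoted = [=[^(['"])]=], [=[(['"])$]=]
for str in text:gmatch("%S+") do
  local squoted = str:match(spat)
  local equoted = str:match(epat)
  local escaped = str:match([=[(\*)['"]$]=]) -- backslashes before a trailing quote
  if squoted and not quoted and not equoted then
    -- fragment opens a quoted token without closing it: start buffering
    buf, quoted = str, squoted
  elseif buf and equoted == quoted and #escaped % 2 == 0 then
    -- fragment ends with the matching, unescaped quote: finish the token
    str, buf, quoted = buf .. ' ' .. str, nil, nil
  elseif buf then
    -- still inside a quoted token: keep accumulating
    buf = buf .. ' ' .. str
  end
  -- print completed tokens with the surrounding quotes stripped
  if not buf then print((str:gsub(spat,""):gsub(epat,""))) end
end
if buf then print("Missing matching quote for "..buf) end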

This will print:

I
am
the text
and
some more text with '
and
escaped \" text

Updated to handle mixed and escaped quotes. Updated to remove quotes. Updated to handle quoted words.
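
Note that the last token above is printed with its backslash intact (escaped \" text). If the consumer expects the unescaped form, one possible post-processing step is sketched below. The helper name unescape_quotes is hypothetical and not part of the answer, and the sketch assumes that only backslash-quote pairs need unescaping (it does not treat a doubled backslash specially).

```lua
-- Hypothetical helper (not from the answer): drop the backslash that
-- precedes an escaped single or double quote.
local function unescape_quotes(token)
  -- A backslash is not a magic character in Lua patterns, so it matches
  -- itself; the captured quote is put back via %1.
  -- The outer parentheses discard gsub's second return value (the count).
  return (token:gsub([[\(["'])]], "%1"))
end

print(unescape_quotes([[escaped \" text]]))
```

Applying this to each token just before printing would leave unquoted words and tokens without escapes unchanged.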

Paul Kulchenko
  • I would prefer something using string parsing. Also, although I didn't say so in the post, I need something that works with both single and double quotes, since the idea of this code is to parse parameters from the shell. – m45t3r Feb 23 '15 at 03:08
  • It's easy to update this solution to make it work with single and double quotes; just replace `"^'"` with `[[^["']]]` and `"'$"` with `[[['"]$]]`. You may also need to check that the opening quote matches the closing one. – Paul Kulchenko Feb 23 '15 at 03:15
  • It's possible to do this with string parsing, but the solution is likely to be more complex (and not with one expression, as Lua patterns are not powerful enough to express what you need). – Paul Kulchenko Feb 23 '15 at 03:16
  • @m45t3r, I updated the code to handle mixed and escaped quotes. – Paul Kulchenko Feb 23 '15 at 18:10
  • Well, we did resolve the problem in another way (using mpv's internal key-value pair representation instead of passing a string), but I quite liked your answer (since it doesn't require another library and the code is cleaner than the other non-library answer), so I am marking this as the answer. – m45t3r Feb 23 '15 at 23:38
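
The point raised in the comments about checking that the opening quote matches the closing one can be illustrated with a Lua pattern back-reference. This is only a minimal sketch: the helper name quoted_spans is hypothetical, it extracts quoted spans only (not bare words), and it does not handle backslash-escaped quotes.

```lua
-- Hypothetical helper (not from the thread): return the contents of every
-- quoted span whose closing quote is the same character as its opening quote.
local function quoted_spans(text)
  local spans = {}
  -- (["']) captures the opening quote, (.-) lazily captures the contents,
  -- and %1 requires the same quote character to close the span.
  for _, body in text:gmatch([[(["'])(.-)%1]]) do
    spans[#spans + 1] = body
  end
  return spans
end

local spans = quoted_spans([[I "am" 'the text' and "it's mixed"]])
for _, s in ipairs(spans) do print(s) end
```

Because of %1, the apostrophe inside the double-quoted span does not end it.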
1

Try this:

text = [[I am 'the text' and '' here is "another text in quotes" and this is the end]]

local e = 0
while true do
    local b = e+1
    b = text:find("%S",b)
    if b==nil then break end
    if text:sub(b,b)=="'" then
        e = text:find("'",b+1)
        b = b+1
    elseif text:sub(b,b)=='"' then
        e = text:find('"',b+1)
        b = b+1
    else
        e = text:find("%s",b+1)
    end
    if e==nil then e=#text+1 end
    print("["..text:sub(b,e-1).."]")
end
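
The scanning loop above silently accepts a quote that is never closed. As a hedged variation on the same idea, the sketch below wraps the loop in a function that collects the tokens in a table and reports an unterminated quote. The name split_quoted and the error message are assumptions, not part of the answer, and escaped quotes are still not handled.

```lua
-- Hypothetical wrapper (the name split_quoted is not from the answer).
local function split_quoted(text)
  local out, e = {}, 0
  while true do
    -- Skip whitespace: b is the start of the next token, if any.
    local b = text:find("%S", e + 1)
    if not b then break end
    local c = text:sub(b, b)
    if c == "'" or c == '"' then
      -- Quoted token: search for the same quote character as plain text.
      e = text:find(c, b + 1, true)
      if not e then
        return nil, "missing closing " .. c .. " for token at position " .. b
      end
      out[#out + 1] = text:sub(b + 1, e - 1)
    else
      -- Bare word: runs up to the next whitespace or the end of the text.
      e = text:find("%s", b + 1) or (#text + 1)
      out[#out + 1] = text:sub(b, e - 1)
    end
  end
  return out
end

local words = assert(split_quoted([[I am 'the text' and "more text"]]))
print(table.concat(words, "|"))
```

Returning nil plus a message follows the usual Lua error-reporting convention, so the caller can wrap the call in assert or handle the failure itself.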
lhf
1

Lua patterns aren't powerful enough to handle this task properly. Here is an LPeg solution adapted from the Lua Lexer. It handles both single and double quotes.

local lpeg = require 'lpeg'

local P, S, C, Cc, Ct = lpeg.P, lpeg.S, lpeg.C, lpeg.Cc, lpeg.Ct

local function token(id, patt) return Ct(Cc(id) * C(patt)) end

local singleq = P "'" * ((1 - S "'\r\n\f\\") + (P '\\' * 1)) ^ 0 * "'"
local doubleq = P '"' * ((1 - S '"\r\n\f\\') + (P '\\' * 1)) ^ 0 * '"'

local white = token('whitespace', S('\r\n\f\t ')^1)
local word = token('word', (1 - S("' \r\n\f\t\""))^1)

local string = token('string', singleq + doubleq)

local tokens = Ct((string + white + word) ^ 0)


input = [["This is a string" 'another string' these are words]]
for _, tok in ipairs(lpeg.match(tokens, input)) do
  if tok[1] ~= "whitespace" then
    if tok[1] == "string" then
      print(tok[2]:sub(2,-2)) -- cut off quotes
    else
      print(tok[2])
    end
  end
end

Output:

This is a string
another string
these
are
words
ryanpattison