
I need to iterate over some pairs of strings in a program that I am writing. Instead of putting the string pairs in a big table-of-tables, I am putting them all in a single string, because I think the end result is easier to read:

function two_column_data(data)
  return data:gmatch('%s*([^%s]+)%s+([^%s]+)%s*\n')
end

for a, b in two_column_data [[
  Hello  world
  Olá    hugomg
]] do
  print( a .. ", " .. b .. "!")
end

The output is what you would expect:

Hello, world!
Olá, hugomg!

However, as the name indicates, the two_column_data function only works if there are exactly two columns of data. How can I make it work with any number of columns?

for x in any_column_data [[
  qwe
  asd
]] do
  print(x)
end

for x,y,z in any_column_data [[
  qwe rty uio
  asd dfg hjk
]] do
  print(x,y,z)
end

I'm OK with using lpeg for this task if it's necessary.

hugomg

4 Answers

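function any_column_data(data)
  -- Iterate over lines, skipping leading whitespace and blank lines.
  -- ('*' rather than '+' so that a one-character row is not skipped.)
  local f = data:gmatch'%S[^\r\n]*'
  return
    function()
      local line = f()
      if line then
        -- Normalize the line to "word word ... " and count the words in ctr.
        local row, ctr = line:gsub('%s*(%S+)','%1 ')
        -- One "(.-) " capture per word returns each word as its own value.
        return row:match(('(.-) '):rep(ctr))
      end
    end
end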
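
As a reading aid, here is a small walk-through of what the two string calls in the function above do for a single row. The sample line and the variable name `pattern` are assumptions made for illustration; this is a sketch, not part of the original answer.

```lua
-- Walk-through of the two string calls for one sample row.
local line = "qwe   rty uio"

-- Step 1: gsub rewrites each word as "word " and also returns the number
-- of substitutions, which equals the number of columns.
local row, ctr = line:gsub('%s*(%S+)', '%1 ')
print(("%q"):format(row), ctr)

-- Step 2: "(.-) " repeated ctr times has one capture per column,
-- so match returns every word as a separate value.
local pattern = ('(.-) '):rep(ctr)
print(pattern)
print(row:match(pattern))
```

One thing to keep in mind with this approach: stock Lua builds limit a pattern to 32 captures (LUA_MAXCAPTURES), so rows with more columns than that would most likely raise an error.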
Egor Skriptunoff
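local unpack = table.unpack or unpack  -- Lua 5.2+ / Lua 5.1 compatibility

local function any_column_data( str )
    local pos = 1
    return function()
        -- Find the next non-empty, newline-terminated line at or after pos.
        local _, to, line = str:find("([^\n]+)\n", pos)
        if line then
            pos = to + 1
            -- Collect the whitespace-separated words of the line.
            local words = {}
            for word in line:gmatch("%S+") do
                words[#words + 1] = word
            end
            -- Note: a whitespace-only line returns nothing and ends the loop.
            return unpack(words)
        end
    end
end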
marsgpl
  • Grumbling? Don't you want this person to understand what's in his code? – warspyking Dec 23 '15 at 23:53
  • @warspyking, I don't care about him, just wanted to try to write the algo ) Seems like Egor Skriptunoff made it shorter and clearer. – marsgpl Dec 24 '15 at 09:15
  • Looks like Louis feels the same way. There's no point in giving someone code unless they understand why it works; you'll just get more questions with the same problem due to lack of sufficient information in his previous questions. – warspyking Dec 24 '15 at 13:37

The outer loop iterates over the lines, and the inner loop iterates over the words in each line.

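local s = [[
  qwe rty uio
  asd dfg hjk
]]

-- %S+ (non-whitespace) is used instead of %w+ so that words containing
-- non-ASCII bytes, such as "Olá", are not split.
for line in s:gmatch('(.-)\n') do      -- outer loop: one line at a time
  for word in line:gmatch('%S+') do    -- inner loop: words in the line
    io.write(word,' ')
  end
  io.write('\n')
end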
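
The nested loops above print the words directly. One possible way to turn the same idea into the iterator the question asks for is to run the loops inside a coroutine and yield each row. This is only a sketch: the use of coroutine.wrap and the names below are assumptions, not part of the original answer.

```lua
-- Lua 5.1 exposes unpack as a global; 5.2+ moved it to table.unpack.
local unpack = table.unpack or unpack

-- Hypothetical helper: wraps the nested loops in a coroutine so that each
-- row is yielded as a list of separate values.
local function any_column_data(data)
  return coroutine.wrap(function()
    for line in data:gmatch('[^\n]+') do      -- outer loop: lines
      local words = {}
      for word in line:gmatch('%S+') do       -- inner loop: words
        words[#words + 1] = word
      end
      if #words > 0 then                      -- skip whitespace-only lines
        coroutine.yield(unpack(words))
      end
    end
  end)
end

for x, y, z in any_column_data [[
  qwe rty uio
  asd dfg hjk
]] do
  print(x, y, z)
end
```

When the coroutine body finishes, the wrapped function returns nothing, which the generic for loop treats as the end of the iteration.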
tonypdmtr
  • A code block alone does not provide a good answer. Please add explanations. – Louis Barranqueiro Dec 24 '15 at 08:55
  • Unless the code (as in this case) is self evident to even a beginner programmer. Otherwise, I would have to teach Computer Science with each answer I give. :) – tonypdmtr Dec 24 '15 at 11:56
  • Maybe to you and the asker, but keep in mind that you are not alone. Just explain why your answer solves the issue, where the mistake was, what you used to solve it, etc. I recommend you read http://stackoverflow.com/help/how-to-answer carefully. – Louis Barranqueiro Dec 24 '15 at 12:00

Here is a version that uses lpeg's re module:

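local re = require 're'                 -- the re module ships with lpeg
local unpack = table.unpack or unpack   -- Lua 5.2+ / Lua 5.1 compatibility

function re_column_data(subj)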
    local t, i = re.compile([[
          record <- {| ({| [ %t]* field ([ %t]+ field)* |} (%nl / !.))* |}
          field <- escaped / nonescaped
          nonescaped <- { [^ %t"%nl]+ }
          escaped <- '"' {~ ([^"] / '""' -> '"')* ~} '"']], { t = '\t' }):match(subj)
    return function()
        local ret 
        i, ret = next(t, i)
        if i then
            return unpack(ret)
        end
    end
end

It is basically a redo of the CSV sample, and it supports quoted fields, which enables some nice use cases: values with spaces, empty values (""), multi-line values, etc.

for a, b, c in re_column_data([[
    Hello  world "test
test"
    Olá    "hug omg"
""]].."\tempty a") do
    print( a .. ", " .. b .. "! " .. (c or ''))
end
wqw