Background Information

I have a csv file with lines that look like this:

+11231231234,13:00:00,17:00:00,1111100,12345,test.net
+11231231234,,,0000000,23456,test.net
+11231231234,18:00:00,19:00:00,1111100,09991,test.net

The lua pattern I have right now is this:

local id, start_time, end_time, asd, int, domain = line:match("(%+%d+),([%d%d:]*),([%d%d:]*),(%d*),([%d%*%#]*),(%a*.*)")

And it's working.

Question

How would I change this pattern so that IF the start_time / end_time values exist, I want to extract ONLY the first two sets of numbers? So for example, from this input:

+11231231234,18:00:00,19:00:00,1111100,09991,test.net

I would like to end up with these values:

start_time = 18:00
end_time = 19:00

instead of

start_time = 18:00:00
end_time = 19:00:00

What I've Tried

I've tried changing this:

line:match("(%+%d+),([%d%d:]*),([%d%d:]*),(%d*),([%d%*%#]*),(%a*.*)")

to this:

line:match("(%+%d+),([%d%d:%d%d]*),([%d%d:%d%d]*),(%d*),([%d%*%#]*),(%a*.*)")

But it was a no-go.

EDIT 1

I changed the pattern to this:

line:match("(%+%d+),(%d*:?%d*)[%d:]*,(%d*:?%d*)[%d:]*,(%d*),([%d%*#]*),(%S*)")

And in some cases it's working, but it fails in the following scenario:

+11231231234,00:00:00,00:00:00,1111100,12345,test.net

So when the timestamp is zero across the board, it doesn't correctly trim the seconds. I'm currently reviewing the code to make sure it's not a typo on my end. Thanks.

Happydevdays
  • So you want only start and end if they exist, but nothing if they don't? – Chris Tanner Oct 07 '16 at 14:47
  • Yes... and if they do exist, I need to truncate / remove the last set of ":00" from each – Happydevdays Oct 07 '16 at 15:01
  • Using `end` as a variable name should cause a script error. Even if your interpreter would accept `end` as a variable name, which I somehow doubt, it is very bad practice. – Piglet Oct 07 '16 at 15:02
  • @Piglet (hee hee, I like your handle!) You're absolutely right. I actually am using "start_time" and "end_time" in the real code... but in an effort to simplify my post here, I removed the _time part. But rest assured, it's just in the post. Sorry for the noise. I've updated the question to clarify – Happydevdays Oct 07 '16 at 15:06
  • My advice is not to try to cram everything into one Lua pattern. Lua patterns are not regular expressions, so just use separate patterns to get what you want. – Wiktor Stribiżew Oct 07 '16 at 15:30

3 Answers

local id, start_time, end_time, asd, int, domain = 
   line:match("(%+%d+),(%d*:?%d*)[%d:]*,(%d*:?%d*)[%d:]*,(%d*),([%d%*#]*),(%S*)")
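
As a quick check, the sketch below runs this pattern over sample lines from the question, including the all-zero timestamp. The `samples` table and the loop are illustrative assumptions; only the pattern comes from the answer.

```lua
-- Pattern copied from the answer above
local pattern = "(%+%d+),(%d*:?%d*)[%d:]*,(%d*:?%d*)[%d:]*,(%d*),([%d%*#]*),(%S*)"

-- Sample lines taken from the question (illustrative test data)
local samples = {
  "+11231231234,13:00:00,17:00:00,1111100,12345,test.net",
  "+11231231234,,,0000000,23456,test.net",
  "+11231231234,00:00:00,00:00:00,1111100,12345,test.net",
}

for _, line in ipairs(samples) do
  local id, start_time, end_time, asd, int, domain = line:match(pattern)
  print(id, start_time, end_time, asd, int, domain)
end
```

With these inputs the time captures come out as 13:00 / 17:00, two empty strings, and 00:00 / 00:00, so the all-zero failure reported in the question may have a cause outside the pattern itself.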
Egor Skriptunoff

I suggest using two Lua patterns for this. Lua patterns cannot apply a quantifier to a group (there is no equivalent of the regex `(:%d+)?`), so the optional parts cannot be expressed cleanly in a single pattern.

So, you may use

(%+%d+),(%d+:%d+):%d+,(%d+:%d+):%d+,(%d*),([%d#]*),(%a*.*)

to get start_time and end_time in the form hh:mm if they are both present; if the pattern does not match, fall back to your previous one.
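
The fallback described here could be sketched roughly as follows. The function name `parse_line` and the variable names `with_times` and `original` are hypothetical; the two patterns are copied from this answer and from the question.

```lua
-- Strict pattern from this answer: both times present, seconds matched but not captured
local with_times = "(%+%d+),(%d+:%d+):%d+,(%d+:%d+):%d+,(%d*),([%d#]*),(%a*.*)"
-- The question's original pattern: times may be empty
local original = "(%+%d+),([%d%d:]*),([%d%d:]*),(%d*),([%d%*%#]*),(%a*.*)"

-- Hypothetical helper: try the strict pattern first, then fall back
local function parse_line(line)
  local id, start_time, end_time, asd, int, domain = line:match(with_times)
  if not id then
    id, start_time, end_time, asd, int, domain = line:match(original)
  end
  return id, start_time, end_time, asd, int, domain
end

print(parse_line("+11231231234,18:00:00,19:00:00,1111100,09991,test.net"))
print(parse_line("+11231231234,,,0000000,23456,test.net"))
```

One limitation of this sketch: a line with only one of the two times present falls through to the original pattern, so that time keeps its seconds.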

Also note that the bracket expressions match a single character (class), so [%d%d:] matches the same characters - digits and : - as [%d:].
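
A small illustration of that point, using a made-up input string `s`: all three bracket expressions below describe the same set of characters, which is why the attempt in the question still captured the seconds.

```lua
local s = "18:00:00"

-- A bracket expression is a set of single characters; repeating %d adds nothing
local a = s:match("[%d%d:]*")
local b = s:match("[%d:]*")
local c = s:match("[%d%d:%d%d]*")  -- the attempt from the question

print(a, b, c)
```

Each call returns the whole string, seconds included, because the set never describes a sequence like "two digits, a colon, two digits".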

Wiktor Stribiżew

Split the string on , with a helper function such as:

function Explode(sInput)
  local x = {}
  -- Append a trailing comma so the last field (the domain) is captured too
  for w in (sInput .. ","):gmatch("(.-),") do
    table.insert(x, w)
  end
  return x
end

You'll get all 6 values in the form of a table. Now, just check whether the strings at indices 2 and 3 are not empty, and parse them as per your requirements:

-- On Lua 5.1, use unpack instead of table.unpack
local id, start_time, end_time, asd, int, domain = table.unpack( Explode(line) )
if start_time:len() > 1 then
  start_time = start_time:match "(%d+:%d+)"
end
if end_time:len() > 1 then
  end_time = end_time:match "(%d+:%d+)"
end
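
For an end-to-end run over sample lines from the question, a variant of this split-then-trim approach might look like the sketch below. The helpers `split_csv` and `trim_seconds` and the `samples` table are hypothetical names introduced for illustration, not part of the answer.

```lua
-- Hypothetical helper: split on commas, keeping empty fields and the last field
local function split_csv(line)
  local fields = {}
  for field in (line .. ","):gmatch("(.-),") do
    fields[#fields + 1] = field
  end
  return fields
end

-- Hypothetical helper: "18:00:00" -> "18:00"; any other value is returned unchanged
local function trim_seconds(time)
  return time:match("^(%d+:%d+):%d+$") or time
end

local unpack = table.unpack or unpack  -- Lua 5.1 only has the global unpack

local samples = {
  "+11231231234,18:00:00,19:00:00,1111100,09991,test.net",
  "+11231231234,,,0000000,23456,test.net",
}

for _, line in ipairs(samples) do
  local id, start_time, end_time, asd, int, domain = unpack(split_csv(line))
  print(id, trim_seconds(start_time), trim_seconds(end_time), asd, int, domain)
end
```

Because `trim_seconds` returns its input when the anchored pattern does not match, empty time fields pass through as empty strings without a separate length check.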
hjpotter92