I have a set of InDesign documents containing records in the following format:
{item_id}. {item_text} [{tags}] (options)
{item_id}. {item_text} [{tags}] (options)
{item_id}. {item_text} [{tags}] (options)
Here item_id is an integer id, item_text is a multi-line text block, and tags is a single-line text block. Tags are optional: a record may or may not have them.
To select one complete record (id, text, tags, and options), I am trying the following regexes:
item = '(([0-9])+\\.\\s+)(\\s|.|\\r)*?(?=[0-9]+\\.\\s)'
item_text = '[0-9]+\\.\\s+((.|\\r|\\s)*)*?(?=\\[(.)*\\])'
tags = '\\[((.)*)\\]'
For the item_text and tags regexes, we extract capture group 1 to get the required data.
With this I get the first n-1 records correctly, but the last record is not selected: no id block follows it, so this part of the item regex never matches there: (?=[0-9]+\.\s)
Can someone suggest a better regex that captures all such records, including the last one? [We are using these regexes in ExtendScript for InDesign scripting, so positive and negative lookaheads and lookbehinds are available in the application.]
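
For reference, here is a minimal sketch of one direction that might work: letting the lookahead accept either the next id block or the end of the input. It is written as plain ES3-style JavaScript and has not been verified inside InDesign. The sample text and the names splitRecords and parseRecord are hypothetical, and it assumes that \r separates paragraphs and that $ without the m flag matches only at the end of the input.

```javascript
// Hypothetical sample text; assumes "\r" is the paragraph separator.
var sample =
  "1. First item text\rsecond line [tag-a, tag-b] (optA)\r" +
  "2. Second item without tags (optB)\r" +
  "3. Last item text [tag-c] (optC)";

// Split the raw text into records. The lookahead accepts either the next
// "N. " id block or the end of the input, so the last record matches too.
function splitRecords(text) {
  var re = /[0-9]+\.\s+[\s\S]*?(?=[0-9]+\.\s|$)/g;
  var records = [];
  var m;
  while ((m = re.exec(text)) !== null) {
    records.push(m[0]);
  }
  return records;
}

// Parse one record into id, text, tags, and options.
// The tags and options parts are both optional in this pattern;
// a missing (or empty) part is returned as null.
function parseRecord(record) {
  var re = /^([0-9]+)\.\s+([\s\S]*?)\s*(?:\[(.*)\]\s*)?(?:\(([^)]*)\)\s*)?$/;
  var m = re.exec(record);
  if (m === null) {
    return null;
  }
  return {
    id: parseInt(m[1], 10),
    text: m[2],
    tags: m[3] ? m[3] : null,
    options: m[4] ? m[4] : null
  };
}
```

In ExtendScript, something like $.writeln(splitRecords(sample).length) could be used to inspect the result. Like the original pattern, this sketch would still split a record early if item_text itself contains a number followed by a period and whitespace.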