I'm having a hard time porting a POSIX regex to Lua string patterns.
I'm dealing with an HTML response from which I would like to filter the checkboxes
that are checked. In particular, I'm interested in the value and name fields of
each checked checkbox.
Here are examples of checkboxes I'm interested in:
<input class="rid-2 form-checkbox" id="edit-2-access-comments" name="2[access comments]" value="access comments" checked="checked" type="checkbox">
<input class="rid-3 form-checkbox real-checkbox" id="edit-3-administer-comments" name="3[administer comments]" value="administer comments" checked="checked" type="checkbox">
By contrast, I'm not interested in this one (an unchecked checkbox):
<input class="rid-2 form-checkbox" id="edit-2-access-printer-friendly-version" name="2[access printer-friendly version]" value="access printer-friendly version" type="checkbox">
In Python I used the following regex pattern: pattern = r'name="(.*)" value="(.*)" checked="checked"'
and it just worked.
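For reference, here is a minimal sketch of that Python behaviour. It assumes each input tag sits on its own line in the response; the `body` and `checked` names are illustrative and the sample string is assembled from the checkboxes shown above.

```python
import re

# Sample body built from the checkboxes above.
# One <input> tag per line is an assumption about the real response.
body = "\n".join([
    '<input class="rid-2 form-checkbox" id="edit-2-access-comments" name="2[access comments]" value="access comments" checked="checked" type="checkbox">',
    '<input class="rid-2 form-checkbox" id="edit-2-access-printer-friendly-version" name="2[access printer-friendly version]" value="access printer-friendly version" type="checkbox">',
    '<input class="rid-3 form-checkbox real-checkbox" id="edit-3-administer-comments" name="3[administer comments]" value="administer comments" checked="checked" type="checkbox">',
])

pattern = r'name="(.*)" value="(.*)" checked="checked"'

# Without re.DOTALL, "." does not match a newline, so every match
# is confined to a single line (and therefore a single tag here).
checked = re.findall(pattern, body)

for name, value in checked:
    print(name, value)
```

Under that one-tag-per-line assumption only the two checked boxes are reported, because the greedy `.*` cannot run past the end of a line.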
My first approach in Lua was simply to use this: pattern = 'name="(.-)" value="(.-)" checked="checked"'
but it gave strange results (the first capture was as expected, but the second
one returned lots of unneeded HTML).
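One possible explanation, sketched in Python for comparison: in Lua patterns `.` also matches newlines, so a lazy `.-` that starts inside an unchecked checkbox can keep growing until it reaches the next `checked="checked"` in a later tag. The snippet below imitates this with `.*?` and `re.DOTALL`, then contrasts it with captures restricted to non-quote characters. The variable names and sample body are illustrative, not taken from the real response.

```python
import re

# Sample body: a checked box, an unchecked box, then another checked box.
body = "\n".join([
    '<input class="rid-2 form-checkbox" id="edit-2-access-comments" name="2[access comments]" value="access comments" checked="checked" type="checkbox">',
    '<input class="rid-2 form-checkbox" id="edit-2-access-printer-friendly-version" name="2[access printer-friendly version]" value="access printer-friendly version" type="checkbox">',
    '<input class="rid-3 form-checkbox real-checkbox" id="edit-3-administer-comments" name="3[administer comments]" value="administer comments" checked="checked" type="checkbox">',
])

# Rough analogue of the Lua pattern: ".-" becomes ".*?", and re.DOTALL
# lets "." cross newlines, as "." does in Lua patterns.
lazy = r'name="(.*?)" value="(.*?)" checked="checked"'
spilled = re.findall(lazy, body, re.DOTALL)

# Restricting each capture to non-quote characters keeps it inside
# a single attribute value, so it cannot run into the next tag.
strict = r'name="([^"]*)" value="([^"]*)" checked="checked"'
fixed = re.findall(strict, body)

print(spilled[1][0])           # name taken from the unchecked box
print("<input" in spilled[1][1])  # value capture ran into the next tag
print(fixed)
```

If this reading is right, the same character-class idea should carry over to a Lua pattern such as 'name="([^"]*)" value="([^"]*)" checked="checked"', though that is untested against the real response body.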
I've also tried the following pattern:
pattern = 'name="(%d?%[.-%])" value="(.-)"%s?(c?).-="?c.-"%s?type="checkbox"'
This time the second capture returned the content of value, but all
checkboxes were matched (not only those with the checked="checked" field).
For completeness, here's the Lua code (snippet from my Nmap NSE script) that attempts to do this pattern matching:
pattern = 'name="(.-)" value="(.-)" checked="checked"'
data = {}
for name, value in string.gmatch(res.body, pattern) do
  stdnse.debug(1, string.format("%s %s", name, value))
end