I'm looking for a regular expression that matches any HTML tag (assume any string of one or more letters a-z is a valid tag name) with any number of attributes, as long as one of them is action="POST".

i.e. the following would match:

<a href="www.somelink.com" action="POST" />
<img action="POST" src="www.someimage.com" ></img>

BUT this would not:

<a href="www.somelink.com" />

I have been working on this and came up with the following:

^<([a-z]+)([^<]*)*action="POST"(?:>(.*)<\/\1>|\s+\/>)$

However, it is not matching (and it crashes some regex checkers). Any thoughts or pushes in the right direction?

Jacob
  • http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Gilles Quénot Dec 23 '14 at 13:23
  • @sputnick That's mostly unrelated, though still carries a good message. – Nic Dec 23 '14 at 13:32
  • It's related: don't parse HTML with REGEX ! – Gilles Quénot Dec 23 '14 at 13:41
  • I know it's a big no-no. I'm not actually using it in a project; it's more to prove a concept for a question posed by a friend... that I'm admittedly having difficulty with! :) Thanks, everyone! – Jacob Dec 23 '14 at 13:59
  • 1.) Remove [anchors](http://www.regular-expressions.info/anchors.html) 2.) `([^<]*)*` [see here](http://www.rexegg.com/regex-explosive-quantifiers.html) 3.) [example with lookahead](https://regex101.com/r/sV5yM6/1) (see explanation top right). It's not recommended to parse html with regex, especially if nested. – Jonny 5 Dec 23 '14 at 17:03
  • Downvoting... mature. "Doesn't show research effort; unclear or not useful": for a place that's meant to be accessible and help people, it can be unfriendly. An unnecessary downvote with no explanation or ownership. I have provided examples of what I have tried and further explained in my comments why I needed this. – Jacob Dec 27 '14 at 14:22
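
The three suggestions in Jonny 5's comment above (drop the anchors, avoid the nested `([^<]*)*` quantifier, use a lookahead) can be sketched roughly as follows. This is only an illustration, not the pattern from the linked regex101 example: the names `TAG_WITH_POST` and `find_post_tags` are made up here, the pattern matches the opening tag only (it does not check for a matching closing tag), and it assumes attribute values never contain `<` or `>`.

```python
import re

# Hypothetical pattern, written for illustration only.
#   <([a-z]+)\b                 tag name made of lowercase letters
#   (?=[^<>]*\saction="POST")   lookahead: the attribute must appear before the tag ends
#   [^<>]*>                     rest of the opening tag, a single non-nested quantifier
TAG_WITH_POST = re.compile(r'<([a-z]+)\b(?=[^<>]*\saction="POST")[^<>]*>')


def find_post_tags(html):
    """Return every opening tag in `html` that carries action="POST"."""
    return [m.group(0) for m in TAG_WITH_POST.finditer(html)]


if __name__ == "__main__":
    samples = [
        '<a href="www.somelink.com" action="POST" />',
        '<img action="POST" src="www.someimage.com" ></img>',
        '<a href="www.somelink.com" />',
    ]
    for sample in samples:
        print(sample, "->", find_post_tags(sample))
```

Because `[^<>]*` is no longer wrapped in a second `*`, the engine has only one way to split the attribute text, which is what avoids the blow-up that the nested quantifier in the original pattern can trigger.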

1 Answer

Try this:

xmllint --html --xpath '//*[@action="POST"]' file_or_URL
Gilles Quénot
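
For anyone who wants the same parser-based check without xmllint, here is a minimal sketch using Python's standard-library `html.parser`. It is not part of the answer above; the class name `PostActionFinder` and the helper `tags_with_post_action` are hypothetical, and like the XPath expression it compares the attribute value exactly (case-sensitive `POST`).

```python
from html.parser import HTMLParser


class PostActionFinder(HTMLParser):
    """Collects the names of tags that carry action="POST" (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; attribute names arrive lowercased.
        # Self-closing tags such as <a ... /> also reach this method, because the
        # default handle_startendtag calls handle_starttag.
        if ("action", "POST") in attrs:
            self.matches.append(tag)


def tags_with_post_action(html):
    finder = PostActionFinder()
    finder.feed(html)
    finder.close()
    return finder.matches


if __name__ == "__main__":
    page = (
        '<a href="www.somelink.com" action="POST" /> '
        '<img action="POST" src="www.someimage.com" ></img> '
        '<a href="www.somelink.com" />'
    )
    print(tags_with_post_action(page))
```

Unlike a regex, the parser handles attribute order, quoting and nesting on its own, which is the point both commenters make.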