
I want to write a regular expression that matches the string

<h1><a href="/index.html"><img src="images/images.jpg" width="960" height="64" alt="[some string]" /></a></h1>

and it must be converted to

The alt of a is [some string]

I am using the following regular expression:

\Q<h1><a href="/index.html"><img src="images/images.jpg" width="960" height="64" alt="\E(.*?)\Q" /></a></h1>\E

but when I get input like

<h1>
 <a href="/index.html">
   <img src="images/images.jpg" width="960" height="64" alt="[some string]" />
 </a>
</h1>

This regular expression does not work for that input, because there are line breaks and spaces after each >.

How can I ignore the spaces and line breaks after the > in the above regular expression?

Thanks in advance!!!
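
For illustration, one way the whitespace could be tolerated is to close each quoted section with \E, allow optional whitespace with \s*, and reopen the quoting with \Q. The sketch below assumes Java's java.util.regex (the \Q…\E syntax suggests it, but the question does not say so); the class name AltExtractor and the method convert are hypothetical names chosen for this example. It only covers whitespace between tags, not other variations in the markup.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AltExtractor {

    // Same literal pieces as the original expression, but each tag is quoted
    // separately and \s* is allowed between tags, so spaces and line breaks
    // after a '>' no longer prevent a match.
    private static final Pattern IMG_ALT = Pattern.compile(
            "\\Q<h1>\\E\\s*"
          + "\\Q<a href=\"/index.html\">\\E\\s*"
          + "\\Q<img src=\"images/images.jpg\" width=\"960\" height=\"64\" alt=\"\\E(.*?)\\Q\" />\\E\\s*"
          + "\\Q</a>\\E\\s*"
          + "\\Q</h1>\\E");

    // Hypothetical helper: returns the converted sentence, or null when the
    // input does not contain the expected markup.
    public static String convert(String html) {
        Matcher m = IMG_ALT.matcher(html);
        if (!m.find()) {
            return null;
        }
        return "The alt of a is " + m.group(1);
    }

    public static void main(String[] args) {
        String singleLine = "<h1><a href=\"/index.html\"><img src=\"images/images.jpg\" "
                + "width=\"960\" height=\"64\" alt=\"[some string]\" /></a></h1>";
        String multiLine = "<h1>\n"
                + " <a href=\"/index.html\">\n"
                + "   <img src=\"images/images.jpg\" width=\"960\" height=\"64\" alt=\"[some string]\" />\n"
                + " </a>\n"
                + "</h1>";

        System.out.println(convert(singleLine));
        System.out.println(convert(multiLine));
    }
}
```

Reading the capture with group(1) rather than using replaceAll avoids having to escape $ and \ in the replacement text.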

Wiktor Stribiżew
  • You should use an html Parser. No regex. – Jens Feb 15 '16 at 08:17
  • [You cannot parse HTML with regex, because HTML is too complex for a regex to parse in every condition; for example, attributes can appear in different orders.](http://stackoverflow.com/a/1732454) – Ferrybig Feb 15 '16 at 08:22
