I am trying to extract XML fragments from an HTML source. The source looks like this:
.
.
.
<h5>
<u>A</u>
</h5>
<ul class="listss">
<li>
<d>
<a href="link">
linktext
</a>
</d>
</li>
<li>
<d>
<a href="link2">
linktext2
</a>
</d>
</li>
</ul>
<h5>
<u>B</u>
</h5>
<ul class="listss">
.\
.(SAME TAGS AS ABOVE)
./
</ul>
<h5>
<u>C</u>
</h5>
<ul class="listss">
.\
.(SAME TAGS AS ABOVE)
./
</ul>
<h5>
<u>D</u>
</h5>
<ul class="listss">
.\
.(SAME TAGS AS ABOVE)
./
</ul>
I need to keep the parent-child relation, so I first want to extract each block of nodes with XPath. But I couldn't manage to select the range of markup from "h5" to "/ul". I need each "h5" together with the "ul" that follows it. The output for the first block must look like this:
<h5>
<u>A</u>
</h5>
<ul class="listss">
<li>
<d>
<a href="link">
linktext
</a>
</d>
</li>
<li>
<d>
<a href="link2">
linktext2
</a>
</d>
</li>
</ul>
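
To make the goal concrete, here is a minimal sketch of the pairing I am after, written in Python with the standard library's xml.etree.ElementTree rather than a pure XPath expression. It assumes the h5 and ul elements are direct children of one common container, that the markup is well-formed, and that no namespace is involved. The wrapping div, the function name pair_headings_with_lists, and the link3 values are hypothetical and only there for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample: the <div> wrapper and "link3" are made up,
# and the markup is assumed to be well-formed with no namespace.
SAMPLE = """<div>
<h5><u>A</u></h5>
<ul class="listss"><li><d><a href="link">linktext</a></d></li></ul>
<h5><u>B</u></h5>
<ul class="listss"><li><d><a href="link3">linktext3</a></d></li></ul>
</div>"""


def pair_headings_with_lists(container):
    """Return (h5, ul) tuples for every h5 directly followed by a ul sibling."""
    children = list(container)
    pairs = []
    # Walk adjacent siblings so each heading stays attached to its own list.
    for current, following in zip(children, children[1:]):
        if current.tag == "h5" and following.tag == "ul":
            pairs.append((current, following))
    return pairs


root = ET.fromstring(SAMPLE)
pairs = pair_headings_with_lists(root)

for h5, ul in pairs:
    heading = h5.findtext("u")
    links = [(a.get("href"), (a.text or "").strip()) for a in ul.iter("a")]
    print(heading, links)
```

Each tuple keeps the heading and its list together, so the parent-child relation (heading, then links) can be rebuilt from it. If the real page is served in a namespace, the tag comparisons would need the qualified names instead.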
I searched through a lot of links and tried many variations, but none of these XPath expressions worked:
/.../*[self::dns:h5 or self::dns:ul]
/.../*[self::dns:h5|self::dns:ul]
/.../*[self::h5 or self::ul]
Any ideas? Thanks.