I'm extracting content from a web page using Yahoo Pipes. For some reason, the developer placed the article content inside <h2> tags, and I'm having difficulty getting the content out of them.
The content looks like this:
<div id="divid"><h2>
<p>Some content</p>
<p>Some more content</p>
</h2>
<!-- some more stuff here -->
</div>
When I use //div[@id='divid'] I can fetch the content of the whole <div> block, but when I try //div[@id='divid']//h2 or //div[@id='divid']//h2/text() I get nothing.
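To make the problem reproducible outside Yahoo Pipes, here is a minimal sketch that runs equivalent queries with Python's standard xml.etree.ElementTree. It rests on assumptions: the snippet is treated as well-formed XML (every <p> closed), it is wrapped in hypothetical <html><body> elements so the <div> is not the root, and ElementTree's limited XPath subset stands in for whatever engine Pipes uses. Pipes' own HTML parser may build a different tree, which this does not model.

```python
import xml.etree.ElementTree as ET

# Assumption: the snippet from the question, tidied into well-formed XML
# and wrapped in <html><body> so the <div> is a descendant of the root.
SNIPPET = """<html><body>
<div id="divid"><h2>
<p>Some content</p>
<p>Some more content</p>
</h2>
<!-- some more stuff here -->
</div>
</body></html>"""

root = ET.fromstring(SNIPPET)

# Equivalent of //div[@id='divid']
div = root.find(".//div[@id='divid']")

# Equivalent of //div[@id='divid']//h2
h2 = root.find(".//div[@id='divid']//h2")

# Text that sits directly inside <h2> (what h2/text() would select):
# only whitespace, because the real text lives in the <p> children.
direct_text = (h2.text or "").strip()

# Text of the <p> elements nested in the <h2>
paragraphs = [p.text for p in h2.findall("p")]

print(repr(direct_text))
print(paragraphs)
```

With a strict XML parse the <h2> is found, but its direct text is empty; the visible text belongs to the nested <p> elements. Whether the same holds for the tree Pipes builds from the real page is something I have not been able to confirm.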
What am I doing wrong, and how can I correctly get the content between the <h2> tags?
You may want to check the actual web page.