I would like to find specific tags within a single Node that belongs to a NodeSet, but when I use XPath on that Node it returns results from the whole NodeSet.
I'm trying to get something like:
{ "head1" => "Volume 1", "head2" => "Volume 2" }
from this HTML:
<h2 class="header">
<a class="header" >head1</a>
</h2>
<table class="volume_description_header" cellspacing="0">
<tbody>
<tr>
<td class="left">Volume 1</td>
</tr>
</tbody>
</table>
<h2 class="header">
<a class="header" >head2</a>
</h2>
<table class="volume_description_header" cellspacing="0">
<tbody>
<tr>
<td class="left">Volume 2</td>
</tr>
</tbody>
</table>
So far I've tried:
require 'nokogiri'
a = File.open("code-above.html") { |f| Nokogiri::HTML(f) }
h = a.xpath('//h2[@class="header"]')
puts h.map { |e| e.next.next }[0].xpath('//td[@class="left"]')
But with this I get:
<td class="left">Volume 1</td>
<td class="left">Volume 2</td>
I'm expecting only the first one.
I've tried doing the XPath inside the block, but this gives me the same result twice.
I checked and
puts h.map { |e| e.next.next }[0]
evaluates to the first Node so I don't understand why XPath looks in the whole NodeSet or even the whole Nokogiri::Document, as I think that's what it actually does.
Can somebody please explain to me the principles of searching and navigating within a selected Node/NodeSet rather than the whole Document? Maybe navigating down a known path would be better in this case, but I don't know how to do that either.