
I couldn't find this exact question, so hopefully it's a new variation on an older one and not a duplicate.

I want to select the table that follows a p.red element whose text() varies from page to page: the 'p' must contain the text "OVERALL" but must not contain the text "Alphabetical".

The DOM looks something like this:

<p class=red>Some Text</p>
  <table class="newclass">
  <tr></tr>
  <tr></tr>
</table>

<p class=red>Some Text</p>
<table class="newclass">
  <tr></tr>
  <tr></tr>
</table>

<p class=red>OVERALL</p>
<table class="newclass">
  <tr></tr>
  <tr></tr>
</table>
  • The number of tables varies from page to page.

I want to get that p tag's text() and also the table directly after it, again only where the text() contains "OVERALL" but not "ALPHABETICAL". Should I build an array and .reject() the elements that don't match? I'm fairly new to Ruby and Mechanize, so I'm not sure. Thanks in advance for any help!

2 Answers


Using Nokogiri's CSS evaluation is nice and clean:

require 'nokogiri'

doc = Nokogiri::HTML(<<EOT)
<p class=red>Some Text</p>
  <table class="newclass">
  <tr></tr>
  <tr></tr>
</table>

<p class=red>Some Text</p>
<table class="newclass">
  <tr></tr>
  <tr></tr>
</table>

<p class=red>OVERALL</p>
<table class="newclass">
  <tr></tr>
  <tr></tr>
</table>
EOT

puts doc.at('p:contains("OVERALL")').to_html
# >> <p class="red">OVERALL</p>

puts doc.at('p:contains("OVERALL") ~ table').to_html
# >> <table class="newclass">
# >> <tr></tr>
# >> <tr></tr>
# >> </table>
the Tin Man

The p tag:

agent.parser.xpath('//p[.="OVERALL"]')[0]

the table after it:

agent.parser.xpath('//p[.="OVERALL"]')[0].next.next

or:

agent.parser.xpath('//p[.="OVERALL"]/following-sibling::table[1]')[0]
pguardiario
  • Just a hint for those who want to be able to find the next tag in a Mechanize object. parser.xpath when your agent is created like `agent = Mechanize.new` . You need to add – Bart C Oct 16 '15 at 05:56
  • Accidentally submitted the previous comment and couldn't edit it after 5 minutes. Just a hint for those who want to be able to find the next tag in a Mechanize object. `parser` is a Nokogiri method, so you have to make sure that your object reports `Nokogiri::XML::Element` when you call `class` on it. If your agent is created like `agent = Mechanize.new`, `agent.parser.xpath` will not work (at least in Mechanize 2.7.3) and will return an error `NameError: undefined local variable or method `parser' for main:Object`. `agent.page.parser.xpath`, however, will work. – Bart C Oct 16 '15 at 07:45
  • Link to a useful post related to the previous comment http://stackoverflow.com/questions/23064821/using-the-mechanize-gem-with-the-nokogirl-gem?rq=1 – Bart C Oct 16 '15 at 07:52