I am trying to scrape a webpage with Mechanize, with the following structure:
<div id="searchResultsBox">
  <div class="listings-wrap">
    <div class="listings-header">
      <div class="listing-cat">Category</div>
      <div class="listing-name">Name</div>
    </div>
    <ul class="listings">
      <li class="listing">
        <a href="/ShowRatings.jsp?tid=1143052">
          <span class="listing-cat">
            <span class="icon"></span>
            TEXT
          </span>
          <span class="listing-name">
            <span class="main">TEXT</span>
            <span class="sub">TEXT</span>
          </span>
        </a>
      </li>
      ...
I want to navigate to the page behind the <a> HTML element. Right now, I have:
require 'mechanize'

agent = Mechanize.new
page = agent.get("URL")
page = page.at('#searchResultsBox > div.listings-wrap > ul > li:nth-child(1) > a')
but it keeps returning nil (verified by puts page.class).
I also tried adding a sleep call to give the page time to load before continuing.
Is there anything I am doing wrong? I thought using the CSS selector would do the trick.