Hpricot (and Nokogiri, because it supports Hpricot's shortcuts) provides two shortcut operators: / for "search" and % for "at".
Search means "find all occurrences of this pattern" and at means "find the first occurrence". Search returns a list of nodes, while at returns a single node, which you have to keep in mind when you want to access the contents of the node.
Generally, at is good for tags or IDs that you know to be unique and will not want to iterate over. Search is for things like walking over all rows in a table, or every <p> tag in a document. You can also chain from a %, which is useful for finding a particular node and then descending into it.
require 'hpricot'
html = '
<html>
<head><title>blah</title></head>
<body>
<div id="foo">
<p>paragraph1</p>
<p>paragraph2</p>
</div>
</body>
</html>
'
doc = Hpricot(html)
doc.at('title').inner_text # => "blah"
(doc / 'p').last.inner_text # => "paragraph2"
(doc % 'p').inner_text # => "paragraph1"
(doc % '#foo').search('p').size # => 2
Personally, I recommend Nokogiri over Hpricot. It supports all the shortcuts, is more full-featured, and is very well supported.
And, the shortcuts / and % are not parts of any standard that I've seen; they're local to Hpricot, and were inherited by Nokogiri for convenience. I don't remember seeing them in Perl or Python parsers.