Questions tagged [nokogiri]

An HTML, XML, SAX and Reader parser for Ruby with the ability to search documents via XPath or CSS3 selectors… and much more

Nokogiri (鋸) is an HTML, XML, SAX and Reader parser for Ruby. Among Nokogiri’s many features is the ability to search documents via XPath or CSS3 selectors.

See the Nokogiri cheat-sheet for tips on using Nokogiri. It is a digest of most of the methods documented at nokogiri.org. Reading the source can help, too.

From the Nokogiri readme:

XML is like violence - if it doesn’t solve your problems, you are not using enough of it.

3699 questions
28 votes, 2 answers

How do I use XPath in Nokogiri?

I have not found any documentation nor tutorial for that. Does anything like that exist? doc.xpath('//table/tbody[@id="threadbits_forum_251"]/tr') The code above will get me any table, anywhere, that has a tbody child with the attribute id equal…
Radek
28 votes, 4 answers

How do I do a regex search in Nokogiri for text that matches a certain beginning?

Given: require 'rubygems' require 'nokogiri' value = Nokogiri::HTML.parse(<<-HTML_END) " A Foo B C Bar …
bcolfer
27 votes, 2 answers

nokogiri failing to upgrade

Anyone seen this? gem update nokogiri Updating installed gems Updating nokogiri Building native extensions. This could take a while... ERROR: Error installing nokogiri: ERROR: Failed to build gem native…
smcracraft
27 votes, 8 answers

Nokogiri, open-uri, and Unicode Characters

I'm using Nokogiri and open-uri to grab the contents of the title tag on a webpage, but am having trouble with accented characters. What's the best way to deal with these? Here's what I'm doing: require 'open-uri' require 'nokogiri' doc =…
Moe
25 votes, 3 answers

How can I get the absolute URL when extracting links using Nokogiri?

I'm using Nokogiri to extract links from a page but I would like to get the absolute path even though the one on the page is a relative one. How can I accomplish this?
Mridang Agarwalla
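
One possible way to approach the absolute-URL question above is to resolve each extracted `href` against the URL of the page it came from, using `URI.join` from Ruby's standard library. This is a minimal sketch, not the thread's accepted answer: the `absolutize` helper, the base URL, and the sample hrefs are all hypothetical. With Nokogiri, the hrefs would typically be collected first with something like `doc.css('a').map { |a| a['href'] }`.

```ruby
require 'uri'

# Hypothetical helper: resolve each href against the page's own URL.
def absolutize(base_url, hrefs)
  hrefs.map { |href| URI.join(base_url, href).to_s }
end

# Made-up base URL and hrefs, standing in for values scraped from a page.
base  = 'http://example.com/docs/index.html'
links = absolutize(base, ['about.html', '/contact', '../img/logo.png', 'http://other.example/x'])
# links now holds:
#   http://example.com/docs/about.html
#   http://example.com/contact
#   http://example.com/img/logo.png
#   http://other.example/x
```

Links that are already absolute pass through unchanged, so the same helper can be applied to every `href` on the page.
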
24 votes, 3 answers

How to get the page source with Mechanize/Nokogiri

I'm logged into a webpage/servlet using Mechanize. I have a page object: jobShortListPg = agent.get(addressOfPage) When I use: puts jobShortListPg I get the "mechanized" version of the page which I don't want: #
Waley Chen
24 votes, 1 answer

How to add attribute to Nokogiri node?

I'm trying to add an attribute to an existing Nokogiri node. What I've done is this: node.attributes['foobar'] = Nokogiri::XML::Attr.new('foo', 'bar') But I get the error: TypeError Exception: wrong argument type String (expected Data) What is a…
Yuval Karmi
24 votes, 3 answers

Nokogiri vs Hpricot?

Which one would you choose? My important attributes are (not in order): Support and future enhancements. Community and general knowledge base (on the Internet). Comprehensive (I.E., proven to parse a wide range of *.*ml pages). Performance. Memory…
roshan
24 votes, 3 answers

How to get the raw HTML of a node

I am using Nokogiri to analyze some HTML, but, I don't know how get the raw HTML inside a node. For example, given: 9746
icn
23 votes, 4 answers

gem install nokogiri -v '1.6.8.1' fails

Building a new Rails app and getting a problem with nokogiri. Said to try gem install nokogiri -v '1.6.8.1' which fails with output below. I tried deleting Gemfile.lock and using the Gemfile from another app which has no problem—bundle install still…
Greg
23 votes, 2 answers

Error installing nokogiri 1.6.0 on mac (libxml2)

UPDATE: Fixed I found the answer in another thread. The workaround I used is to tell Nokogiri to use the system libraries instead: NOKOGIRI_USE_SYSTEM_LIBRARIES=1 bundle install ==== Trying to install nokogiri 1.6.0 on a mac. With previous…
Jose Enrique
23 votes, 2 answers

How to use xmlns declarations with XPath in Nokogiri

I'm using Nokogiri::XML to parse responses from Amazon SimpleDB. The response is something like:
Mark Rendle
23 votes, 10 answers

Nokogiri error when running bundle install

Trying to get a cloned Rails app running. When running bundle install I get this error: Using mini_portile (0.5.0) Installing nokogiri (1.6.0) Gem::InstallError: nokogiri requires Ruby version >= 1.9.2. An error occurred while installing nokogiri…
wikichen
23 votes, 1 answer

Escape single quote in XPath with Nokogiri?

I have an XPath query that looks like this, with both single and double quotes. How do I escape the apostrophe properly so that the query works? I tried: "//li[text()='Frank's car']" but it doesn't seem to do it for me. Any ideas? …
abhir
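
On the quoting question above: XPath 1.0 string literals have no escape character, so a literal containing an apostrophe has to be delimited with double quotes, and a value containing both quote kinds has to be assembled with `concat()`. The following is a minimal sketch of that idea. The `xpath_literal` helper is hypothetical, not part of Nokogiri, and the resulting string would then be passed to `doc.xpath` as usual.

```ruby
# Hypothetical helper: build an XPath 1.0 string literal for any Ruby string.
def xpath_literal(text)
  # No apostrophe: single-quote delimiters are safe.
  return "'#{text}'" unless text.include?("'")
  # Apostrophe but no double quote: switch to double-quote delimiters.
  return %("#{text}") unless text.include?('"')

  # Both quote kinds present: split on apostrophes and glue the pieces
  # back together with concat(), inserting "'" between them.
  parts = text.split("'", -1).map { |part| "'#{part}'" }
  "concat(#{parts.join(%q(, "'", ))})"
end

query = "//li[text()=#{xpath_literal("Frank's car")}]"
# query is now: //li[text()="Frank's car"]
```

Building the literal in a helper keeps the quoting decision in one place, which matters most when the searched text comes from user input.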