Is there a method for Nokogiri::XML::SAX::Document similar to
accessions = doc.at_xpath('//Node/Childtag').content?
I have XML like this:
<accession>Police-1234</accession>
<accession>Police-6574</accession>
<police>
  <privateCar>
    <fullName>BMW 750Li</fullName>
  </privateCar>
  <officeCar>
    <fullName>Ford Mustang GT</fullName>
  </officeCar>
  <optional>
    <fullName>Porsche carrera 511</fullName>
  </optional>
</police>
My code looks something like this:
require 'rubygems'
require 'nokogiri'
include Nokogiri

class PostCallbacks < XML::SAX::Document
  def initialize
    @in_title = false
    @in_title2 = false
  end

  def start_element(element, attributes)
    @attrs = attributes
    @content = ''
    @in_title = element.eql?("accession")
    # Collecting all the other nodes/tags
    @in_title2 = element.eql?("fullName")
  end

  def end_document
    # puts "Here is where the attributes could be played with"
  end

  def characters(string)
    string.strip!
    if @in_title && !string.empty?
      puts "Accession: #{string}"
    elsif @in_title2 && !string.empty?
      puts "Full Name: #{string}"
    end
    @content << string if @content
  end
end

parser = XML::SAX::Parser.new(PostCallbacks.new)
parser.parse(File.open(ARGV[0]))
My results are:
Accession: Police-1234
Accession: Police-6574
Full Name: BMW 750Li
Full Name: Ford Mustang GT
Full Name: Porsche carrera 511
Now I have two questions:
- How do I restrict collection to only the "accession" element with the value "Police-1234"?
- How do I retrieve only the full name under privateCar? That is, I want only BMW 750Li as my result.
For the first point, I generally use doc.xpath('//accession').first
to pull out the first entry in the XML.
For the second point, I know I can select it using XPath with doc.at_xpath('//police/privateCar/fullName'),
but is there something similar for the SAX parser?
I am using SAX because the XML file I need to parse is large.