1

I am trying to select an element in an SVG document by one of its attributes. I set up a simple example:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg xmlns:svg="http://www.w3.org/2000/svg" xmlns="http://www.w3.org/2000/svg">
  <g id='1'>
    <path id='2' type='A'/>
    <rect id='3' type='B'/>
  </g>
</svg>

Now I use the following syntax to retrieve the path element by its attribute "type":

require 'rexml/document'
include REXML
xmlfile = File.new "xml_as_specified_above.svg"
xmldoc = Document.new(xmlfile)
XPath.match( xmldoc.root, "//path[@type]" )

The syntax is taken directly from http://www.w3schools.com/xpath/xpath_syntax.asp. I would expect this expression to select the path element, but this is what I get:

>> XPath.match( xmldoc.root, "//path[@type]" )
=> []

So, what is the correct syntax in XPath to address the path element by its attribute? Or is there a bug in REXML (using 3.1.7.3)? Plus points for also retrieving the "rect" element.

Pascal
  • I've just tried your code above and it works fine here. Does it work if you use the simpler "//path" XPath without requiring the `type` attribute? – mikej Feb 04 '11 at 13:04
  • Are you sure 3.1.7.3 is being used and there isn't an older version lurking somewhere in your Ruby path? Try checking the output of `puts XPath::VERSION` – mikej Feb 04 '11 at 13:10
  • `XPath::VERSION` is 1.8.7. The `//path` XPath works as expected and gives the path element. – Pascal Feb 04 '11 at 13:17
  • Aha! An older version of rexml is being picked up then. Earlier versions didn't have a `VERSION` constant so the 1.8.7 you're seeing is actually the toplevel `VERSION` constant for the Ruby version and not the version of rexml. The older versions don't support the full XPath spec hence `@type` doesn't work. – mikej Feb 04 '11 at 13:20
  • So if you can put that in an answer I will accept it - thanks! – Pascal Feb 04 '11 at 13:24

4 Answers

3

It looks like an older version of rexml is being picked up that doesn't support the full XPath spec.

Try checking the output of `puts XPath::VERSION` to ensure that 3.1.7.3 is displayed.
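A minimal sketch of a slightly more defensive check, assuming Ruby 1.9 or later (for the two-argument `const_defined?`) and a REXML release that defines `REXML::VERSION`. It looks the constant up on the `REXML` module only, so a top-level `VERSION` cannot be mistaken for it, and it prints the path of the file that was actually loaded. The variable names `rexml_version` and `loaded_from` are only illustrative.

```ruby
require 'rexml/document'

# Look VERSION up on the REXML module itself (inherit = false), so that a
# top-level VERSION constant is never returned by accident.
rexml_version =
  if REXML.const_defined?(:VERSION, false)
    REXML.const_get(:VERSION, false)
  else
    'unknown (no REXML::VERSION constant, probably an old release)'
  end

# The path of the loaded file shows which copy of REXML is really in use.
loaded_from = $LOADED_FEATURES.grep(%r{rexml/document}).first

puts "REXML version: #{rexml_version}"
puts "Loaded from:   #{loaded_from}"
```

If the reported path points somewhere unexpected, an older copy is probably earlier in the load path.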

mikej
  • This answer is wrong because by no means should `//path[@type]` select any element in the provided input sample. If that were possible, you should stop using this XPath engine, because it is not standards compliant. –  Feb 04 '11 at 15:42
0

You need to take the default namespace into account. With XPath 1.0 you need to bind a prefix (e.g. svg) to the namespace URI http://www.w3.org/2000/svg and then use a path like //svg:path[@type]. How you bind a prefix to a URI for XPath evaluation depends on the XPath API you use. I am afraid I don't know how that is done with your Ruby API; if you can't find a method or property for it in the API documentation, maybe someone else will come along later and tell us.
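For REXML specifically, one way to bind the prefix is the optional third argument of `REXML::XPath.match`, a hash that maps prefixes to namespace URIs. The following is a minimal sketch under that assumption, using the sample document from the question inline; the constant `SVG_NS` and the variable names are only illustrative.

```ruby
require 'rexml/document'

SVG_NS = 'http://www.w3.org/2000/svg'

svg = <<~XML
  <?xml version="1.0" encoding="UTF-8" standalone="no"?>
  <svg xmlns:svg="http://www.w3.org/2000/svg" xmlns="http://www.w3.org/2000/svg">
    <g id='1'>
      <path id='2' type='A'/>
      <rect id='3' type='B'/>
    </g>
  </svg>
XML

doc = REXML::Document.new(svg)

# Bind the prefix "svg" to the SVG namespace URI for XPath evaluation.
namespaces = { 'svg' => SVG_NS }

# Select the elements by qualified name, requiring a "type" attribute.
paths = REXML::XPath.match(doc, '//svg:path[@type]', namespaces)
rects = REXML::XPath.match(doc, '//svg:rect[@type]', namespaces)

path_ids = paths.map { |element| element.attributes['id'] }
rect_ids = rects.map { |element| element.attributes['id'] }

puts path_ids.inspect
puts rect_ids.inspect
```

The prefix used in the expression only has to match the key in the hash; it does not need to be the prefix used in the document.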

Martin Honnen
  • But isn't the default namespace of the document already bound to http://www.w3.org/2000/svg? Additionally the element can be accessed with `//path` as XPath element. So in my opinion `//path[@type]` should work as well. – Pascal Feb 04 '11 at 13:05
  • With XPath 1.0 the path `foo` selects elements with local name `foo` in no namespace. The SVG elements are in a certain namespace so with XPath 1.0 you need to bind a prefix to the SVG namespace URI and use that prefix. That is how XPath 1.0 is defined and operates. As I said, I don't know your implementation and whether it conforms to the standard. – Martin Honnen Feb 04 '11 at 13:47
  • +1 Correct answer. **This is the big XPath FAQ**: `//path` will select any `path` element under the empty or null namespace URI. –  Feb 04 '11 at 15:44
0

Many of us use Nokogiri these days instead of REXML or Hpricot, another early Ruby XML parser.

Nokogiri supports both XPath and CSS selectors, so you can use familiar HTML-style selectors to get at nodes:

require 'nokogiri'

svg = %q{<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg xmlns:svg="http://www.w3.org/2000/svg" xmlns="http://www.w3.org/2000/svg">
  <g id='1'>
    <path id='2' type='A'/>
    <rect id='3' type='B'/>
  </g>
</svg>
}

doc = Nokogiri::XML(svg)
puts doc.search('//svg:path[@type]')
puts doc.search('svg|path[@type]')
puts doc.search('path[@type]')

puts doc.search('//svg:rect')
puts doc.search('//svg:rect[@type]')
puts doc.search('//svg:rect[@type="B"]')
puts doc.search('svg|rect')
puts doc.search('rect')

# >> <path id="2" type="A"/>
# >> <path id="2" type="A"/>
# >> <path id="2" type="A"/>

# >> <rect id="3" type="B"/>
# >> <rect id="3" type="B"/>
# >> <rect id="3" type="B"/>
# >> <rect id="3" type="B"/>
# >> <rect id="3" type="B"/>

The first path is XPath with the namespace. The second is CSS with a namespace. The third is CSS without namespaces. Nokogiri, being friendly to humans, lets us deal with the namespaces, or dispense with them, in a couple of ways, assuming we are aware of why namespaces are good.

the Tin Man
0

This is the most frequently asked XPath question: the default namespace issue.

Solution:

Instead of:

//path[@type]

use

//svg:path[@type]

Dimitre Novatchev