15

I'm trying to see how libxml implements XPath support, so it made sense to me to test using xmllint. However, the obvious option, --pattern, is somewhat obscure, and I ended up using something like the following:

test.xml: <foo><bar/><bar/></foo>

> xmllint --shell test.xml
/  > dir /foo
ELEMENT foo
/  > dir /foo/*
ELEMENT bar
ELEMENT bar

This seems to work, and that's great, but I'm still curious. What is xmllint's --pattern option for, and how does it work?

Provide an example for full credit. =)

Matthew Lowe

4 Answers

24

The --xpath option, which appears to be undocumented, is more useful.

% cat data.xml
<project>
  <name>
    bob
  </name>
  <version>
    1.1.1
  </version>
</project>
% xmllint --xpath '/project/version/text()' data.xml | xargs -I{} echo -n "{}"
1.1.1
% xmllint --xpath '/project/name/text()' data.xml | xargs -I{} echo -n "{}"
bob
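
If the xargs step feels fragile, the same extraction can be sketched in Python. This is a stand-in, not xmllint or libxml: it assumes the standard library's xml.etree.ElementTree, and the helper name text_at is hypothetical. ElementTree evaluates paths relative to the root element, so /project/version is written as version.

```python
import xml.etree.ElementTree as ET

# Same document as data.xml above, inlined so the sketch is self-contained.
DATA = """<project>
  <name>
    bob
  </name>
  <version>
    1.1.1
  </version>
</project>"""


def text_at(xml_text, path):
    """Return the whitespace-trimmed text of the first element matching `path`.

    `path` uses ElementTree's limited XPath syntax and is evaluated
    relative to the document's root element. Returns None if nothing matches.
    """
    root = ET.fromstring(xml_text)
    value = root.findtext(path)
    return value.strip() if value is not None else None


if __name__ == "__main__":
    print(text_at(DATA, "version"))
    print(text_at(DATA, "name"))
```

The strip() call does the job that xargs does in the pipeline above: it removes the newlines and indentation surrounding the text node.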
l0st3d
  • Hm, it looks like that option was added since I last updated the library. I'll check it out the next time I update. Thanks! – Matthew Lowe Mar 12 '11 at 03:22
7

The hint is in the words "which can be used with the reader interface to the parser": xmllint only uses the reader interface when passed the --stream option:

$ xmllint --stream --pattern /foo/bar test.xml
Node /foo/bar[1] matches pattern /foo/bar
Node /foo/bar matches pattern /foo/bar
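
To illustrate what matching a pattern against the reader can look like, here is a minimal Python sketch. It is not libxml's implementation: it substitutes the standard library's xml.etree.ElementTree.iterparse for the reader interface, and the helper name stream_matches is hypothetical. It keeps a stack of the currently open element names as parse events arrive and reports each element whose absolute path equals a simple pattern such as /foo/bar, without first building the whole tree.

```python
import io
import xml.etree.ElementTree as ET


def stream_matches(xml_text, pattern):
    """Yield the path of each element whose absolute path equals `pattern`.

    Illustration only: supports plain child steps such as /foo/bar,
    not the XPath subset that libxml's pattern module accepts.
    """
    wanted = [step for step in pattern.split("/") if step]
    stack = []  # names of the elements that are currently open
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            if stack == wanted:
                yield "/" + "/".join(stack)
        else:
            stack.pop()
            elem.clear()  # discard finished content, as a streaming consumer would


if __name__ == "__main__":
    for path in stream_matches("<foo><bar/><bar/></foo>", "/foo/bar"):
        print("Node", path, "matches pattern /foo/bar")
```

Run against the test.xml content from the question, this reports both bar elements. Unlike xmllint, it does not add positional indexes such as [1] to the reported paths.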
npostavs
3

If you simply want the text values of several XML nodes, and --xpath is not available in your version of xmllint, you could use something like this:

./foo.xml:

<hello>
   <world>its alive!!</world>
   <world>and works!!</world>
</hello>

$ xmllint --stream --pattern /hello/world --debug ./foo.xml | grep -A 1 "matches pattern" | grep "#text" | sed 's/.* [0-9] //'
its alive!!
and works!!
Sean
3

From the xmllint(1) man page:

   --pattern PATTERNVALUE
          Used to exercise the pattern recognition engine, which can be
          used with the reader interface to the parser. It allows to
          select some nodes in the document based on an XPath (subset)
          expression. Used for debugging.

It understands only a subset of XPath, and it is intended as a debugging aid. Full XPath 1.0 is implemented by libxml2's own xpath module, which libxslt(3) and its command-line tool xsltproc(1) build on.

The "pattern" module in libxml "allows to compile and test pattern expressions for nodes either in a tree or based on a parser state" and its documentation lives here: http://xmlsoft.org/html/libxml-pattern.html

Ari.

iter
  • I can also read the xmllint man page! However, it doesn't tell me what I want to know. Using xmllint --pattern always seems to spit back the entire doc, which is why I'm asking the question in the first place. Like I said, "provide an example for full credit". – Matthew Lowe Feb 14 '10 at 21:13