What XPath query will select the <media:thumbnail /> node in the following XML?

<item>
  <title>Sublime Federer crushes Wawrinka</title>
  <description>Defending champion Roger Federer cruises past Stanislas Wawrinka 6-1 6-3 6-3 to take his place in the Australian Open semi-finals.</description>
  <link>http://news.bbc.co.uk/go/rss/-/sport2/hi/tennis/9372592.stm</link>
  <guid isPermaLink="false">http://news.bbc.co.uk/sport1/hi/tennis/9372592.stm</guid>
  <pubDate>Tue, 25 Jan 2011 04:21:23 GMT</pubDate>
  <category>Tennis</category>
  <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/50933000/jpg/_50933894_011104979-1.jpg"/>
</item>

The XML came from this RSS feed.

Pops
Izmoto
  • That "node with a colon" is a node using an **XML namespace** (defined in the node) - you need to get a grip on what those are and how to deal with them - see http://www.intertwingly.net/stories/2002/09/09/gentleIntroductionToNamespaces.html – marc_s Jan 27 '11 at 14:00
  • What programming language / system / environment are you using?? – marc_s Jan 27 '11 at 14:03
  • @marc_s: Thanks for your quick response. I'm using c++/libxml2 – Izmoto Jan 27 '11 at 14:11
  • I just figured I can do //@url to get all url attributes – Izmoto Jan 27 '11 at 14:39
  • yes, selecting an attribute will work, since those are typically not in any XML namespace...; unfortunately, I'm neither fluent in C++ nor do I know libxml2 :-( so I can't really help you here. Check your documentation on how to define and use XML namespaces when selecting XML using XPath! – marc_s Jan 27 '11 at 15:16
  • @marc_s: thanks so very much for your useful input. – Izmoto Jan 27 '11 at 15:19
  • First exact duplicate in google search [Use XPath to parse element name containing a colon](http://stackoverflow.com/questions/4282147/use-xpath-to-parse-element-name-containing-a-colon) –  Jan 27 '11 at 16:30

3 Answers

You need to learn about namespaces and how to define/register a namespace in your XPath engine so that you can then use the associated prefix for names in that registered namespace. There are plenty of questions in the xpath tag asking how to use names that are in a namespace -- with good answers. Search for them.

A very rough answer (ignoring namespaces altogether) is:

//*[name()='media:thumbnail']
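
As a rough illustration of the namespace-aware approach described above, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`, not the asker's C++/libxml2 setup. It assumes the `media` prefix is bound to `http://search.yahoo.com/mrss/` (the URI quoted in a comment further down this page) and wraps the question's `<item>` in hypothetical `<rss>` and `<channel>` elements that carry the declaration, since the fragment on its own leaves the prefix unbound.

```python
import xml.etree.ElementTree as ET

# Assumption: the feed binds the "media" prefix to the Media RSS namespace URI.
MEDIA_NS = "http://search.yahoo.com/mrss/"

# The question's <item> (shortened), wrapped so that the prefix is declared.
xml_text = """\
<rss xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Sublime Federer crushes Wawrinka</title>
      <category>Tennis</category>
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/50933000/jpg/_50933894_011104979-1.jpg"/>
    </item>
  </channel>
</rss>
"""

root = ET.fromstring(xml_text)

# "Registering" the namespace here means passing a prefix -> URI mapping
# alongside the path expression. The prefix used in the query only has to
# match this mapping, not the prefix used in the document.
namespaces = {"media": MEDIA_NS}
thumbnail = root.find(".//item/media:thumbnail", namespaces)

thumbnail_url = thumbnail.get("url")
print(thumbnail_url)
```

Other XPath engines expose the same idea through their own registration call, so check the documentation of the one you are using for the equivalent step.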
Dimitre Novatchev
  • how to write the xpath if media:thumbnail had child element with media:image and we had to get media:image element – bkk Apr 17 '13 at 07:48
  • @bhatkrishnakishor, Use `//*[name()='media:thumbnail']/*[name() = 'media:image']` – Dimitre Novatchev Apr 17 '13 at 14:33
  • /item/*[local-name()='thumbnail'] is what worked for me. – Skystrider Oct 23 '19 at 17:25
  • I'm looping over an XmlNodeList, but on every iteration I get the same child node, which is the first one. I have no clue why this is happening – Dushyanth Kandiah Aug 25 '20 at 10:20
  • @DushyanthKandiah: Please, ask a question and provide all relevant code and data. Also, explain there what you want to select with the XPath expression. – Dimitre Novatchev Aug 25 '20 at 15:09
  • @DimitreNovatchev this line of code worked for me `*[local-name()='thumbnail']` – Dushyanth Kandiah Aug 28 '20 at 10:53
  • @DushyanthKandiah, This means that you have elements named `thumbnail` that are in more than one namespace. Generally I discourage people from using `local-name()` because they may not be aware of the different namespace issue and if they were, they would probably only want to select elements in a particular (not all) namespace. – Dimitre Novatchev Aug 28 '20 at 15:17

What worked for me is:

/item/*[local-name()='thumbnail']
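
To show what matching on the local name implies, here is a small sketch in Python's standard `xml.etree.ElementTree`. That module's XPath subset has no `local-name()` function, so the helper `local_name` below is a hand-written stand-in, and the second namespace (`http://example.com/hypothetical`) is invented purely for illustration. The point is that a local-name match ignores the namespace entirely, so it returns `thumbnail` elements from every namespace.

```python
import xml.etree.ElementTree as ET

# Two <thumbnail> children in different namespaces; the "other" namespace
# is hypothetical and exists only to demonstrate the effect.
xml_text = """\
<item xmlns:media="http://search.yahoo.com/mrss/"
      xmlns:other="http://example.com/hypothetical">
  <media:thumbnail url="a.jpg"/>
  <other:thumbnail url="b.jpg"/>
</item>
"""


def local_name(element):
    """Return the tag without its namespace part.

    ElementTree stores qualified tags as "{namespace-uri}local".
    """
    return element.tag.rpartition("}")[2]


item = ET.fromstring(xml_text)

# Rough equivalent of /item/*[local-name()='thumbnail']
matches = [child for child in item if local_name(child) == "thumbnail"]
matched_tags = [child.tag for child in matches]
print(matched_tags)
```

Both elements match here, which is harmless when only one namespace defines `thumbnail` and surprising when more than one does.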
Stefan van den Akker
Skystrider
  • While the accepted answer is a better answer in that it is more thorough and explains what is going on, this is the better answer because the solution works. – rawkintrevo Nov 15 '19 at 11:21
  • lol ya that's because I don't really understand. I just noticed local-name() in completely unrelated xpath and since name() wasn't working for me I tried local-name() and it worked. So the accepted answer is much better, in situations where name() works. – Skystrider Jan 28 '20 at 17:43
  • For whatever reason this did not work with .NET System.Xml namespace tools in C#. I had to specifically add the namespace to my XmlDoc object. By the way, the namespace it is using for "media:thumbnail" is: xmlns: media = 'http://search.yahoo.com/mrss/' – qxotk Mar 31 '20 at 17:09

If you're looping over an XmlNodeList, just use *[local-name()='thumbnail'] relative to each node.