lxml doc.find returning None

Question

I have this XML output and want to extract a few elements from it.

Sample XML:

<?xml version="1.0" encoding="UTF-8"?>
<rpc-reply xmlns="urn:ietf:2.0" message-id="101">
  <data>
    <bl1 xmlns="http://example.com/ns/xyz/xxx-op">
      <A>
        <B>
          <C>
            <D>
              <E>0.0.0.0</E>
              <F>1200</F>
              <C>
                <G>0</G>
                <H>0</H>
                <I>0</I>
                <J>0</J>
                <K>0</K>
                <L>0</L>
                <M>0</M>
                <N>0</N>
                <O>0</O>
                <P>0</P>
                <Q>0</Q>
                <R>0</R>
                <S>0</S>
                <T>0</T>
                <U>0</U>
                <V>0</V>
                <W>0</W>
                <X>0</X>
              </C>
              <Y>1.1.1.1</Y>
              <Z>IPv6</Z>
            </D>
          </C>
        </B>
      </A>
    </bl1>
  </data>
</rpc-reply>

I tried the following snippet

Code

from lxml import etree
doc = etree.parse("sample.xml")
print(doc)
memoryElem = doc.find('Y')   # no namespace given
print(memoryElem)            # prints None
print(memoryElem.text)       # element text

This does not work: memoryElem prints None. Where am I going wrong?

har07 · Answer 1 · 2016-06-02T11:26:31.860

3

Your target element is in the default namespace:

xmlns="http://example.com/ns/xyz/xxx-op"

You need to map a prefix to the default namespace URI and use that prefix to reference elements in that namespace:

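# map an arbitrary prefix to the default namespace URI
ns = {'d': 'http://example.com/ns/xyz/xxx-op'}
# './/' searches all descendants, not just direct children
memoryElem = doc.find('.//d:Y', ns)
print(memoryElem.text)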
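
For anyone who wants to try this without a file on disk, here is a minimal sketch that embeds a trimmed copy of the sample XML as a string (the trimming and the variable names are mine, not from the question). It compares the un-namespaced lookup, the prefix-mapped lookup, and the equivalent `{uri}tag` form. The fallback import is only there so the snippet also runs where lxml is not installed; as far as I know the standard library's ElementTree behaves the same for these particular calls.

```python
try:
    from lxml import etree
except ImportError:  # assumption: stdlib ElementTree is an acceptable stand-in here
    import xml.etree.ElementTree as etree

# Trimmed copy of the sample document (most leaf elements omitted)
XML = """<rpc-reply xmlns="urn:ietf:2.0" message-id="101">
  <data>
    <bl1 xmlns="http://example.com/ns/xyz/xxx-op">
      <A><B><C><D>
        <E>0.0.0.0</E>
        <Y>1.1.1.1</Y>
        <Z>IPv6</Z>
      </D></C></B></A>
    </bl1>
  </data>
</rpc-reply>"""

root = etree.fromstring(XML)
ns = {'d': 'http://example.com/ns/xyz/xxx-op'}

# Without a namespace the path matches nothing
plain = root.find('.//Y')

# With the prefix mapped to the namespace URI
prefixed = root.find('.//d:Y', ns)

# Same lookup with the URI written inline in braces, no prefix map needed
clark = root.find('.//{http://example.com/ns/xyz/xxx-op}Y')

print(plain)
print(prefixed.text)
print(clark.text)
```

The braces form is handy for one-off lookups; the prefix map tends to read better once several paths share the same namespace.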

edited Jun 02 '16 at 11:26

answered Jun 02 '16 at 08:42

har07


Why do you use `d` as the namespace prefix? Is this arbitrary? – Jun 02 '16 at 08:43

@LutzHorn Yes, it is arbitrary, as long as it is mapped to the correct namespace URI (my idea was 'd' for 'default'). – har07 Jun 02 '16 at 08:44
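
To illustrate the point about the prefix being arbitrary, here is a small sketch (the cut-down XML and the prefix names are made up for the example). Two different prefixes mapped to the same URI find the same element, while a familiar-looking prefix mapped to the wrong URI finds nothing. The fallback import is an assumption so the snippet still runs without lxml.

```python
try:
    from lxml import etree
except ImportError:  # assumption: stdlib ElementTree is an acceptable stand-in here
    import xml.etree.ElementTree as etree

# Cut-down version of the sample document
XML = """<rpc-reply xmlns="urn:ietf:2.0" message-id="101">
  <data>
    <bl1 xmlns="http://example.com/ns/xyz/xxx-op">
      <A><B><C><D><Y>1.1.1.1</Y></D></C></B></A>
    </bl1>
  </data>
</rpc-reply>"""

root = etree.fromstring(XML)
uri = 'http://example.com/ns/xyz/xxx-op'

# Any prefix works as long as it maps to the right URI
with_d = root.find('.//d:Y', {'d': uri})
with_other = root.find('.//anything:Y', {'anything': uri})

# Same prefix, wrong URI (the outer rpc-reply namespace): no match
wrong_uri = root.find('.//d:Y', {'d': 'urn:ietf:2.0'})

print(with_d.text)
print(with_other.text)
print(wrong_uri)
```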

score 0 · Answer 2 · answered Jun 02 '16 at 08:42


This is perhaps a namespace problem.

You could try BeautifulSoup:

import bs4

# features="xml" selects the XML parser, which needs lxml installed
with open("sample.xml", "r") as f:
    soup = bs4.BeautifulSoup(f.read(), features="xml")
yt = soup.find("Y").text
print(yt)

Output:

1.1.1.1
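
If you want to confirm the namespace suspicion before switching libraries, one option is to print every element's tag. This is only a sketch on a cut-down copy of the sample (the inline XML is my own trimming, and the fallback import is an assumption for machines without lxml). Namespaced elements show up with the URI in braces in front of the local name, which is why a bare 'Y' does not match.

```python
try:
    from lxml import etree
except ImportError:  # assumption: stdlib ElementTree is an acceptable stand-in here
    import xml.etree.ElementTree as etree

# Cut-down version of the sample document
XML = """<rpc-reply xmlns="urn:ietf:2.0" message-id="101">
  <data>
    <bl1 xmlns="http://example.com/ns/xyz/xxx-op">
      <A><B><C><D><Y>1.1.1.1</Y></D></C></B></A>
    </bl1>
  </data>
</rpc-reply>"""

root = etree.fromstring(XML)

# Collect the fully qualified tag of the root and every descendant
tags = [el.tag for el in root.iter()]
for tag in tags:
    print(tag)
```

The outer elements carry the rpc-reply namespace and everything from bl1 downward carries the xxx-op one, so a lookup for Y has to use the second URI.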
