Check if XML Element has children or not, in ElementTree

Question

I retrieve an XML document this way:

import urllib2
import xml.etree.ElementTree as ET

root = ET.parse(urllib2.urlopen(url))
for child in root.findall("item"):
  a1 = child[0].text # ok
  a2 = child[1].text # ok
  a3 = child[2].text # ok
  a4 = child[3].text # BOOM
  # ...

The XML looks like this:

<item>
  <a1>value1</a1>
  <a2>value2</a2>
  <a3>value3</a3>
  <a4>
    <a11>value222</a11>
    <a22>value22</a22>
  </a4>
</item>

How do I check if a4 (in this particular case, but it might've been any other element) has children?

jlr · Accepted Answer · 2014-09-20T17:50:25.830

16

You could try the list function on the element:

>>> xml = """<item>
  <a1>value1</a1>
  <a2>value2</a2>
  <a3>value3</a3>
  <a4>
    <a11>value222</a11>
    <a22>value22</a22>
  </a4>
</item>"""
>>> root = ET.fromstring(xml)
>>> list(root[0])
[]
>>> list(root[3])
[<Element 'a11' at 0x2321e10>, <Element 'a22' at 0x2321e48>]
>>> len(list(root[3]))
2
>>> print("has children" if len(list(root[3])) else "no child")
has children
>>> print("has children" if len(list(root[2])) else "no child")
no child
>>> # Or simpler, without wrapping the element in list(), this also works:
>>> print("has children" if len(root[3]) else "no child")
has children

I modified your sample because the findall call on the root did not work: findall searches the direct children of the element it is called on, not the element itself, so it finds nothing when the root is itself the <item> element. If you want to access the text of the subchildren afterward in your working program, you could do:

for child in root.findall("item"):
  # if there are children, get their text content as well.
  if len(child):
    for subchild in child:
      print(subchild.text)
  # else just get the current child text.
  else:
    print(child.text)

This would be a good fit for a recursive function, though.
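The recursion mentioned above could look something like the following. It is only a sketch: the helper name `to_dict` is made up for illustration, and it assumes sibling tags are unique (duplicate tags would overwrite each other in the dictionary).

```python
import xml.etree.ElementTree as ET

XML = """<item>
  <a1>value1</a1>
  <a2>value2</a2>
  <a3>value3</a3>
  <a4>
    <a11>value222</a11>
    <a22>value22</a22>
  </a4>
</item>"""


def to_dict(elem):
    """Return elem.text for a leaf, or a dict of children keyed by tag.

    Hypothetical helper; assumes sibling tags are unique.
    """
    if len(elem) == 0:  # no child elements
        return elem.text
    return {child.tag: to_dict(child) for child in elem}


root = ET.fromstring(XML)
data = to_dict(root)
print(data["a1"])         # value1
print(data["a4"]["a11"])  # value222
```

The `len(elem) == 0` test is the same check as in the session above, just applied at every level of the tree.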

edited Sep 20 '14 at 17:50

answered Sep 20 '14 at 16:14

jlr


doesn't work. Could you use my example with iteration? – Incerteza Sep 20 '14 at 16:28

it does not work, because your iteration loop yields no elements, since there are no elements named 'item' – marscher Sep 20 '14 at 16:36
how do I get "" and "" elements? – Incerteza Sep 20 '14 at 16:44
It works, check this pythonfiddle: http://pythonfiddle.com/check-if-element-has-children-or-not Else tell me exactly what did not work. Your sample did not work though, hence why I modified it. Let me modify my answer to tell you how to access the subchildren. – jlr Sep 20 '14 at 17:34

Mad Physicist · Answer 2 · 2016-07-22T18:20:32.593

8

The simplest way I have been able to find is to use the bool value of the element directly. This means you can use a4 in a conditional statement as-is:

from xml.etree.ElementTree import Element

a4 = Element('a4')
if a4:
    print('Has kids')
else:
    print('No kids yet')

a4.append(Element('x'))
if a4:
    print('Has kids now')
else:
    print('Still no kids')

Running this code will print

No kids yet
Has kids now

The boolean value of an element does not say anything about text, tail or attributes. It only indicates the presence or absence of children, which is what the original question was asking.
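One caveat: `find()` returns `None` when nothing matches, and `None` is falsy just like a childless element, so a plain `if elem:` cannot tell the two cases apart. A minimal sketch that spells out both checks, assuming a small made-up document and a hypothetical helper named `describe`:

```python
import xml.etree.ElementTree as ET

# Illustrative document, not taken from the question verbatim.
root = ET.fromstring("<item><a1>value1</a1><a4><a11>x</a11></a4></item>")


def describe(parent, tag):
    """Report whether `tag` is absent, childless, or has children."""
    elem = parent.find(tag)
    if elem is None:      # no matching element at all
        return "missing"
    if len(elem) == 0:    # element exists but has no child elements
        return "no children"
    return "has children"


print(describe(root, "a1"))       # no children
print(describe(root, "a4"))       # has children
print(describe(root, "missing"))  # missing
```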

edited Jul 22 '16 at 18:20

answered Jul 22 '16 at 18:13

Mad Physicist



Has been deprecated for years due to confusion whether there is an element object or not: https://lxml.de/tutorial.html#elements-are-lists – Gringo Suave May 12 '22 at 03:14

@GringoSuave. Apparently my answer was 8 years out of date even when I wrote it. Just now I'm working on my first project actually using lxml, before that I only read some old docs :) – Mad Physicist May 12 '22 at 03:20

score 3 · Answer 3 · edited Dec 17 '17 at 13:14

I would personally recommend that you use an XML parser that fully supports XPath expressions. The subset supported by xml.etree is insufficient for tasks like this.

For example, in lxml I can do:

"give me all children of the children of the <item> node":

doc.xpath('//item/*/child::*') #equivalent to '//item/*/*', if you're being terse
Out[18]: [<Element a11 at 0x7f60ec1c1348>, <Element a22 at 0x7f60ec1c1888>]

or,

"give me all of <item>'s children that have no children themselves":

doc.xpath('/item/*[count(child::*) = 0]')
Out[20]: 
[<Element a1 at 0x7f60ec1c1588>,
 <Element a2 at 0x7f60ec1c15c8>,
 <Element a3 at 0x7f60ec1c1608>]

or,

"give me ALL of the elements that don't have any children":

doc.xpath('//*[count(child::*) = 0]')
Out[29]: 
[<Element a1 at 0x7f60ec1c1588>,
 <Element a2 at 0x7f60ec1c15c8>,
 <Element a3 at 0x7f60ec1c1608>,
 <Element a11 at 0x7f60ec1c1348>,
 <Element a22 at 0x7f60ec1c1888>]

# and if I only care about the text from those nodes...
doc.xpath('//*[count(child::*) = 0]/text()')
Out[30]: ['value1', 'value2', 'value3', 'value222', 'value22']
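If installing lxml is not an option, the last two queries can be roughly approximated with the standard library by combining `iter()` with `len()`. This is only a sketch for comparison, using the sample XML from the question; note that `iter()` also visits the root element itself.

```python
import xml.etree.ElementTree as ET

XML = """<item>
  <a1>value1</a1>
  <a2>value2</a2>
  <a3>value3</a3>
  <a4>
    <a11>value222</a11>
    <a22>value22</a22>
  </a4>
</item>"""

root = ET.fromstring(XML)

# Every element (root included) that has no child elements.
leaves = [e for e in root.iter() if len(e) == 0]

print([e.tag for e in leaves])
# ['a1', 'a2', 'a3', 'a11', 'a22']
print([e.text for e in leaves])
# ['value1', 'value2', 'value3', 'value222', 'value22']
```

The filtering happens in Python rather than in the query string, which is the trade-off compared with the XPath expressions above.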

Suggesting lxml assumes there is a problem with performance and xpath features are lacking. It's definitely better than ElementTree but I wouldn't go this way if there is no problem with the latter, especially considering that lxml requires installation and it's not always a nice walk in the park. — jlr, Sep 20 '14 at 17:47
Performance is a thing, yes, but full xpath support means that you do all the work of selecting nodes in one compact place. xpath queries take me a few seconds to write; writing python code to walk the tree and select the nodes I want takes longer and is far likelier to generate bugs. There are lots of benefits other than performance. — roippi, Sep 20 '14 at 17:56

score 3 · Answer 4 · answered Mar 16 '22 at 17:40


As of today (Python 3.9), you can use the len() function on an ElementTree element; it returns the number of direct children.

In this case, for example:

if len(child[3]) > 0:
    a4 = child[3].text
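Applied to the loop from the question, the same check might look like the sketch below. The `<root>` wrapper element is an assumption, added so that `findall("item")` has something to match.

```python
import xml.etree.ElementTree as ET

# The <root> wrapper is assumed; the question only shows the <item> part.
XML = """<root>
<item>
  <a1>value1</a1>
  <a2>value2</a2>
  <a3>value3</a3>
  <a4>
    <a11>value222</a11>
    <a22>value22</a22>
  </a4>
</item>
</root>"""

root = ET.fromstring(XML)
values = []
for item in root.findall("item"):
    for field in item:
        if len(field) > 0:
            # field has child elements: collect their text instead
            values.append([sub.text for sub in field])
        else:
            values.append(field.text)

print(values)
# ['value1', 'value2', 'value3', ['value222', 'value22']]
```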


RonnieSH


This is the correct, modern solution. Does not use deprecated test, or build a list to be thrown away. – Gringo Suave May 12 '22 at 03:12
Looks like lxml supports this as well, probably earlier but not sure exactly when. – Gringo Suave Jul 25 '22 at 20:17

marscher · Answer 5 · 2022-06-07T20:56:44.403

Update 2022:

Elements are iterable, and they also implement the boolean operation. So you can use an element directly to check whether it has children, like this:

import xml.etree.ElementTree as ET
xml = """<item>
  <a1>value1</a1>
  <a2>value2</a2>
  <a3>value3</a3>
  <a4>
    <a11>value222</a11>
    <a22>value22</a22>
  </a4>
</item>"""
root = ET.fromstring(xml)

def traverse(e_):
    for c in e_:
        if not c:  # element c has no children
            print(f"text of element: '{c.text}'")
        else:
            traverse(c)

traverse(root)

Produces the following output:

text of element: 'value1'
text of element: 'value2'
text of element: 'value3'
text of element: 'value222'
text of element: 'value22'

Deprecated solution (Python 2.7, <Python 3.9):

The Element class has a getchildren method. So you could use something like this to check whether there are children and store the result in a dictionary keyed by tag name:

result = {}
for child in root.findall("item"):
   if child.getchildren() == []:
      result[child.tag] = child.text
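Since `getchildren()` is no longer available in Python 3.9 and later, a rough modern equivalent of the snippet above could use `len()` instead. This sketch iterates over the children of an `<item>` root directly rather than calling `findall("item")`; that is an adaptation to the shortened sample XML used here, not part of the original snippet.

```python
import xml.etree.ElementTree as ET

# Shortened sample document, for illustration only.
root = ET.fromstring(
    "<item><a1>value1</a1><a2>value2</a2>"
    "<a4><a11>value222</a11></a4></item>"
)

result = {}
for child in root:
    if len(child) == 0:  # replaces child.getchildren() == []
        result[child.tag] = child.text

print(result)  # {'a1': 'value1', 'a2': 'value2'}
```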

`getchildren` is deprecated though since version 2.7. [From the documentation](https://docs.python.org/2/library/xml.etree.elementtree.html): Use list(elem) or iteration. — jlr, Sep 20 '14 at 16:15
`getchildren` is not just deprecated. It was removed in Python 3.9. — mzjn, May 25 '22 at 12:53
You're mentioning *length operation*, but you're not actually using it here, nor explaining how to use it. Can you provide an updated example? — not2qubit, Jun 06 '22 at 10:52

score 0 · Answer 6 · answered May 21 '18 at 11:17


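You can use the iter method, which walks every element in the tree, and keep the text of those elements that contain something other than whitespace: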

import xml.etree.ElementTree as ET

etree = ET.parse('file.xml')
root = etree.getroot()
a = []
for child in root.iter():
    # keep only text that contains something besides whitespace
    if child.text and child.text.strip():
        a.append(child.text)
print(a)


David Córdoba Ruiz


score 0 · Answer 7 · answered Apr 24 '21 at 07:44


A very simple method is to call

list(<element>)

If the resulting list is empty, the element has no children.


Sergey Solod


Can you explain and also show how to use it on the example from above? – not2qubit Jun 06 '22 at 10:43
