Parse XML in Python with lxml.etree

Question

How can I parse this feed (http://www.tvspielfilm.de/tv-programm/rss/heute2015.xml) with Python to get, for example, today's TV program on SAT at 20:15? I tried the Python library lxml.etree, but without success:

#!/usr/bin/python
import lxml.etree as ET 
import urllib2

response = urllib2.urlopen('http://www.tvspielfilm.de/tv-programm/rss/heute2015.xml')
xml = response.read()

root = ET.fromstring(xml)

for item in root.findall('SAT'):
    title = item.find('title').text
    print title

findall('SAT') makes the parser search for tags named SAT. Since there is no such tag, you get no results. I recommend looping over all 'item' tags instead. – Vincent Beltman, Nov 26 '14 at 13:00

Arpegius · Accepted Answer · 2014-11-27T08:58:53.833

The method Element.findall takes a path expression as its argument (ElementPath, a limited subset of XPath). 'SAT' matches only direct children named SAT of the root node, which is 'rss'. To find a tag anywhere in the document, use './/SAT'.
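
To illustrate the difference, here is a minimal sketch in Python 3. The inline document is a made-up stand-in for an RSS feed, not the real one, and the fallback to the standard library's xml.etree.ElementTree is only there so the snippet also runs where lxml is not installed (both accept the same path expressions here).

```python
try:
    import lxml.etree as ET
except ImportError:
    # Standard-library fallback; findall() accepts the same paths used below.
    import xml.etree.ElementTree as ET

# Made-up sample document (assumption: items are nested inside <channel>).
SAMPLE = b"""<rss version="2.0">
  <channel>
    <title>Example channel</title>
    <item><title>First</title></item>
    <item><title>Second</title></item>
  </channel>
</rss>"""

root = ET.fromstring(SAMPLE)  # root is the <rss> element

direct = root.findall('item')       # only direct children of <rss>
anywhere = root.findall('.//item')  # descendants at any depth

direct_count = len(direct)
titles = [item.findtext('title') for item in anywhere]
print(direct_count, titles)
```

The first call finds nothing because the 'item' elements sit one level deeper, inside 'channel'; the second call reaches them.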

The expression './/item' is what you are looking for:

#!/usr/bin/python
import lxml.etree as ET 
import urllib2

response = urllib2.urlopen('some/url/to.xml')
xml = response.read()

root = ET.fromstring(xml)

for item in root.findall('.//item'):
    title = item.find('title').text
    print title
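
To pick out a single channel and start time, one option is to filter on the item titles. The sketch below (Python 3) assumes each title has the layout "time | channel | show"; that layout, the sample data, and the helper name find_shows are all assumptions for illustration, so check the actual feed and adjust the parsing accordingly.

```python
try:
    import lxml.etree as ET
except ImportError:
    # Standard-library fallback so the sketch runs without lxml.
    import xml.etree.ElementTree as ET

# Hypothetical sample; the "time | channel | show" title layout is an assumption.
SAMPLE = b"""<rss version="2.0">
  <channel>
    <item><title>20:15 | Das Erste | Example Film</title></item>
    <item><title>20:15 | SAT.1 | Example Show</title></item>
    <item><title>22:00 | SAT.1 | Late Example</title></item>
  </channel>
</rss>"""


def find_shows(root, channel, start):
    """Return show names whose title matches the given channel and start time."""
    shows = []
    for item in root.findall('.//item'):
        title = item.findtext('title') or ''
        parts = [part.strip() for part in title.split('|')]
        if len(parts) != 3:
            continue  # title does not follow the assumed layout
        item_start, item_channel, show = parts
        # Compare channel names case-insensitively.
        if item_start == start and item_channel.lower() == channel.lower():
            shows.append(show)
    return shows


root = ET.fromstring(SAMPLE)
result = find_shows(root, 'SAT.1', '20:15')
print(result)
```

With a real feed, root would come from ET.fromstring on the downloaded bytes, as in the code above.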

edited Nov 27 '14 at 08:58

answered Nov 26 '14 at 13:17

Arpegius

for item in root.findall('.*/item'): title = item.find('title').text print title worked for me. – user3531864 Nov 26 '14 at 14:05
@Arpegius can u plz help on this one https://stackoverflow.com/questions/63079625/python-xml-parse-and-getelementsbytagname – sunny babau Jul 24 '20 at 21:18
