The XML in your example has a root element which is a span element containing the following child nodes:
- the text node "Master"
- a span element
- the text node "Part-time, Full-time"
- another span element
- the text node "On Campus"
You say you want to extract the text nodes "Part-time, Full-time" and "On Campus"? Presumably you want an XPath that you can apply to other similar XML data, and several different criteria could return those same two text nodes. So I'm going to guess that your criterion is this: you want to extract any text node which is preceded by a sibling span element whose class attribute is Divider. The appropriate XPath would be:
/span/text()[preceding-sibling::span/@class='Divider']
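To check that expression against data shaped like yours, here is a minimal sketch using the third-party lxml library. The sample markup is my reconstruction from the node list above (I'm assuming empty Divider spans and no extra whitespace), so adjust it to your real document.

```python
from lxml import etree

# Sample reconstructed from the description above; the real markup may differ.
xml = (
    '<span>Master'
    '<span class="Divider"/>Part-time, Full-time'
    '<span class="Divider"/>On Campus'
    '</span>'
)

root = etree.fromstring(xml)

# Select text nodes that have a preceding sibling <span class="Divider">.
nodes = root.xpath("/span/text()[preceding-sibling::span/@class='Divider']")

# lxml returns string-like objects; convert to plain str and trim whitespace.
texts = [str(node).strip() for node in nodes]
print(texts)  # ['Part-time, Full-time', 'On Campus']
```

The leading "Master" text node is skipped because it has no preceding sibling span at all.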
That said, I suspect the ElementTree XPath interface may not work for you, because it doesn't support XPath queries that return text nodes, only elements (that's what I understand, anyway; I'm not a Python programmer). However, I know that the XPath API of lxml.etree will return text nodes; see, for example, https://lxml.de/tutorial.html#using-xpath-to-find-text
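If you need to stay with the standard library, one possible workaround relies on the fact that ElementTree has no separate text-node objects: text that follows an element is stored on that element as its .tail attribute. The sketch below selects the Divider spans (which ElementTree's limited path syntax can do) and reads their tails. The sample markup is again my reconstruction, not your actual data.

```python
import xml.etree.ElementTree as ET

# Sample reconstructed from the description above; the real markup may differ.
xml = (
    '<span>Master'
    '<span class="Divider"/>Part-time, Full-time'
    '<span class="Divider"/>On Campus'
    '</span>'
)

root = ET.fromstring(xml)

# Find child <span class="Divider"> elements, then take the text that
# immediately follows each one (its .tail), skipping empty or missing tails.
texts = [
    span.tail.strip()
    for span in root.findall("span[@class='Divider']")
    if span.tail and span.tail.strip()
]
print(texts)  # ['Part-time, Full-time', 'On Campus']
```

Note that this is slightly narrower than the XPath above: it only picks up text immediately after a Divider span, whereas the XPath matches any text node with a Divider span anywhere among its preceding siblings. For data shaped like the example, the two give the same result.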