I am trying to iterate through the elements of an XML document and fire events on 'start' and 'end' elements.
This is pretty straightforward using Python's lxml module, and there is even another question on SO regarding this:
Using Python's xml.etree to find element start and end character offsets
#!/usr/bin/env python3
import sys
from lxml import etree

# Usage: script.py <dtd-file> <xml-file>
dtd = etree.DTD(open(sys.argv[1], "r"))
# Read bytes so that an XML encoding declaration is handled by the parser
xml = etree.XML(open(sys.argv[2], "rb").read())
result = dtd.validate(xml)

# Report any validation errors
for error in dtd.error_log.filter_from_errors():
    print(error.message)
    print(error.line)
    print(error.column)

if result:
    # Walk the parsed tree, receiving one event per opening and closing tag
    for event, elem in etree.iterwalk(xml, events=('start', 'end')):
        if event == 'start':
            print('starting element:', elem.tag)
        elif event == 'end':
            print('ending element:', elem.tag)
            # Text that follows the closing tag (the root has none of interest)
            if elem is not xml:
                print(elem.tail)
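To make the event order I am after concrete, here is a minimal self-contained sketch. It is not the script above: it swaps lxml's iterwalk for the standard library's xml.etree.ElementTree.iterparse, skips DTD validation, and runs on a made-up inline document. The names SAMPLE and walk_events are hypothetical and exist only for this illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, not one of the files passed to the script above.
SAMPLE = "<root><a>text</a>tail<b/></root>"


def walk_events(xml_text):
    """Return the (event, tag) pairs for xml_text in document order."""
    events = []
    # iterparse reads from a file-like object; wrap the encoded text in BytesIO.
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        events.append((event, elem.tag))
    return events


for event, tag in walk_events(SAMPLE):
    print(event, tag)
```

Each element produces a 'start' event when its opening tag is seen and an 'end' event when its closing tag is seen, with the events of nested elements falling in between. That per-closing-tag callback is the part I cannot reproduce in C++.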
I would like to do essentially the same thing using the tinyxml2 C++ XML library, but I have not had any luck so far, specifically with detecting closing tags.
I prefer tinyxml2 as it is 'tiny', but I am open to other C++ XML libs if they can achieve this end (more easily).
If there is a better way to fire events on end tags, I am open to that as well.