I am using lxml to read my xml file. I am using a code something like below. It works just fine with lxml2.3 beta1, but with lxml2.3 it gives me zn xml syntax error as shown below. I went through the release notes for both versions, but could not figure out what could have caused this error or how to fix it. Please help if you have come across such a thing or have any clues about it.
Thanks!!
Code:
from lxml import etree
def parseXml(context,attribList,elemList):
for event, element in context:
if element.tag in elemList:
#read element attributes
element.clear()
def main(object):
ns='{NS}'
attribList=['name','age','id']
elemList=[ns+'Employee',ns+'Experience',ns+'Employment',ns+'Project',ns+'Award']
context=etree.iterparse(fullFilePath, events=("start","end"))
parseXml(context,attribList,elemList)
Error:
File "iterparse.pxi", line 478, in lxml.etree.iterparse.next (src/lxml/lxml.etree.c:95348) File "iterparse.pxi", line 530, in lxml.etree.iterparse._read_more_events (src/lxml/lxml.etree.c:95886) File "parser.pxi", line 585, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:71955) XMLSyntaxError: Namespace default prefix was not found, line 545, column 73
xml sample -
<root xmlns='NS'>
<Employee Name="Mr.ZZ" Age="30">
<Experience TotalYears="10" StartDate="2000-01-01" EndDate="2010-12-12">
<Employment id = "1" EndTime="ABC" StartDate="2000-01-01" EndDate="2002-12-12">
<Project Name="ABC_1" Team="4">
</Project>
</Employment>
<Employment id = "2" EndTime="XYZ" StartDate="2003-01-01" EndDate="2010-12-12">
<PromotionStatus>Manager</PromotionStatus>
<Project Name="XYZ_1" Team="7">
<Award>Star Team Member</Award>
</Project>
</Employment>
</Experience>
</Employee>
</root>
The 'Employee' are repeated within the root. And the error happens after the parser has gone though many of the employees correctly.
Edit 1: On capturing the exception, I catch the following:
WARNING:NAMESPACE:NS_ERR_UNDEFINED_NAMESPACE: Namespace default prefix was not found