I am seeing a "SAXParseException: output_pdml.xml:1089:0: not well-formed (invalid token)" error while parsing a pcap log in PDML (.xml) format. The root cause is this string in the decoded log message: "". The characters "<><>" are what trigger the failure. Going through the Python xml.sax documentation, I found an escape function that replaces such characters with their XML-safe form:
xml.sax.saxutils.escape(data, entities={})
Escape '&', '<', and '>' in a string of data.
However, I do not see how to invoke this function, because xml.sax is entirely event-driven. In my code most of the parsing happens in the startElement and endElement functions. I suspect the exception is raised internally when startElement is invoked:
def startElement(self, data, attr):
startElement only checks the type of the data element and builds a Python object structure from it, and I assume the error happens during these steps. My question is how to apply escape when creating the parser instance and installing the content handler:
parser = xml.sax.make_parser()
parser.setContentHandler(content_handler_fun(call_back))
parser.setFeature(xml.sax.handler.feature_external_ges, False)
parser.parse(stdout_stream)
How do I invoke the escape function before calling parse()?
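A minimal sketch that reproduces this class of error on a small stand-in document. The element name `proto`, the sample text, and the helper `try_parse` are hypothetical and are not taken from the real PDML file or the original handler; the handler here simply ignores every event.

```python
import io
import xml.sax
import xml.sax.handler


class NullHandler(xml.sax.handler.ContentHandler):
    """Content handler that ignores every event (stand-in for the real one)."""


def try_parse(xml_bytes):
    """Return None on success, or the parser's error message on failure."""
    parser = xml.sax.make_parser()
    parser.setContentHandler(NullHandler())
    parser.setFeature(xml.sax.handler.feature_external_ges, False)
    try:
        parser.parse(io.BytesIO(xml_bytes))
    except xml.sax.SAXParseException as exc:
        return exc.getMessage()
    return None


# Hypothetical stand-in for a decoded field containing raw "<>" characters.
bad = b"<proto>decoded text <><> more text</proto>"
# The same content with the markup characters already escaped.
good = b"<proto>decoded text &lt;&gt;&lt;&gt; more text</proto>"

print(try_parse(bad))   # error message from the underlying parser
print(try_parse(good))  # None, the escaped version parses cleanly
```

This suggests the parser rejects the input while tokenizing it, before the handler ever sees the offending text, so the escaping would have to happen on the data before it reaches parse().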
I tried calling escape() like this:
parser = xml.sax.make_parser()
parser.setContentHandler(content_handler_fun(call_back))
parser.setFeature(xml.sax.handler.feature_external_ges, False)
data = xml.sax.saxutils.escape(stdout_stream)
parser.parse(data)
Failure observed: '_io.BufferedReader' object has no attribute 'replace'
I think escape() expects string data, but in my case I have to parse the content of the stdout stream. Reading the whole stream into a string and then parsing it is not practical, because the content is too large to load into memory. I need to keep working with a stream reference.
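A minimal sketch of why this call fails, assuming a small in-memory stream in place of the real stdout stream. The helper `escape_error` and the sample data are hypothetical. escape() works by calling str methods on its argument, so it accepts a str but not a stream object.

```python
import io
from xml.sax.saxutils import escape


def escape_error(obj):
    """Return the error message raised by escape(obj), or None if it succeeds."""
    try:
        escape(obj)
    except AttributeError as exc:
        return str(exc)
    return None


# escape() on an in-memory string works as documented.
print(escape("a <b> & c"))  # a &lt;b&gt; &amp; c

# A buffered binary stream has no replace() method, so escape() fails on it.
stream = io.BufferedReader(io.BytesIO(b"<proto>a <> b</proto>"))
print(escape_error(stream))

# escape() does not distinguish tags from text: on a whole document
# it would escape the real markup as well.
print(escape("<proto>a</proto>"))  # &lt;proto&gt;a&lt;/proto&gt;
```

The last call shows a second problem with this approach: even if the whole stream could be read into a string, escaping it wholesale would also turn the legitimate tags into text, so only the offending characters inside the content would need escaping.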