I'm parsing some XML using Python's Expat (by calling parser = xml.parsers.expat.ParserCreate()
and then setting the relevant callbacks to my methods).
It seems that when Expat calls read(nbytes) to fetch new data, nbytes is always 2,048. I have quite a lot of XML to process, and I suspect that these small read() calls are slowing down the whole process. As a point of reference, I'm seeing throughput of around 9 MB/s on an Intel Xeon X5550 (2.67 GHz) running Windows 7.
I've tried setting parser.buffer_text = True and parser.buffer_size = 65536, but Expat is still calling the read() method with an argument of just 2,048.
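For reference, here is a minimal sketch of how the read sizes can be observed. It assumes the parser is driven through parser.ParseFile() with a file-like object, which the description above implies but does not state. RecordingReader and observed_read_sizes are hypothetical names used only for illustration, and the sample document is made up.

```python
import io
import xml.parsers.expat


class RecordingReader:
    """Hypothetical file-like wrapper that logs the size passed to each read()."""

    def __init__(self, data):
        self._stream = io.BytesIO(data)
        self.sizes = []

    def read(self, nbytes=-1):
        # Record what the parser asked for, then delegate to the real stream.
        self.sizes.append(nbytes)
        return self._stream.read(nbytes)


def observed_read_sizes(data, buffer_size=65536):
    """Parse `data` with the settings from the question and return the read sizes."""
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True          # buffer character data between callbacks
    parser.buffer_size = buffer_size   # affects text buffering, as set in the question
    reader = RecordingReader(data)
    parser.ParseFile(reader)           # assumption: ParseFile() is what issues the read() calls
    return reader.sizes


if __name__ == "__main__":
    # Made-up sample document, large enough to need many reads.
    sample = b"<root>" + b"<item>x</item>" * 10000 + b"</root>"
    print(sorted(set(observed_read_sizes(sample))))
```

Printing the distinct sizes makes it easy to check whether changing buffer_size has any effect on what read() is asked for.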
Is it possible to increase this?