I'm processing large (1TB) XML files using the StAX API. Let's assume we have a loop handling some elements:
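XMLInputFactory fac = XMLInputFactory.newInstance();
XMLStreamReader reader = fac.createXMLStreamReader(new FileReader(inputFile));
// hasNext()/next() ends cleanly at END_DOCUMENT; nextTag() throws there
// and on any non-whitespace text between tags
while (reader.hasNext()) {
    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
        // handle contents
    }
}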
How do I keep track of overall progress within the large XML file? Fetching the offset from reader works fine for smaller files:
int offset = reader.getLocation().getCharacterOffset();
but since getCharacterOffset() returns an int, it will probably only work for files up to 2 GB.
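
One workaround I can think of, as a rough sketch only: stop asking the reader for the offset and count bytes in the underlying stream with a long counter. The class names CountingInputStream and ProgressSketch and the reportEvery parameter below are made up for illustration and are not part of StAX. This assumes that byte position is an acceptable progress measure and that the total file size is known (for example from File.length()). The parser reads ahead in buffers, so the count runs slightly ahead of the element currently being handled.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper (not part of StAX): counts the bytes pulled through it.
class CountingInputStream extends FilterInputStream {
    private long count; // long, so it does not wrap at 2 GB

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) count++;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) count += n;
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        count += skipped;
        return skipped;
    }

    // Disable mark/reset so re-read bytes are never counted twice.
    @Override
    public boolean markSupported() {
        return false;
    }

    long getCount() {
        return count;
    }
}

class ProgressSketch {
    // Parses the stream, prints approximate progress every `reportEvery`
    // start elements, and returns the number of start elements seen.
    static long countStartElements(InputStream raw, long totalBytes, long reportEvery)
            throws XMLStreamException {
        CountingInputStream in = new CountingInputStream(raw);
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        long elements = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                elements++;
                if (elements % reportEvery == 0) {
                    System.out.printf("%.1f%%%n", 100.0 * in.getCount() / totalBytes);
                }
            }
        }
        reader.close();
        return elements;
    }
}
```

For a real file this would be called with something like new FileInputStream(inputFile) and inputFile.length(), assuming inputFile is a java.io.File. I have not tried this on anything close to 1 TB.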