I'm investigating use cases for streaming in XSLT. I know of two clear cases:
A. You need to transform a very large document, the entirety of which cannot be held in memory.
B. You only need a small part of the document, and often that "small part" is near the top. You can then save time via early exit.
I'm writing to ask if, in practice, there is a third real use case:
C. You have a simple transformation and want to forgo the CPU time required to build the XML tree.
To give an example, imagine a store's shipments are stored in an XML structure with the following format:
Top-level = Year
2nd level = Month
3rd level = Day of shipment
4th level = Shipment ID
5th level = Individual items in shipment
For the sake of example, consider a transformation whose purpose is to pull information at the "month" level, needing only data stored in attributes of the month elements and nothing from the descendants of those nodes.
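To make the month-level example concrete, here is a minimal sketch in Python rather than XSLT, so it says nothing about Saxon's behavior. It pulls the attributes of the month elements once with an event-based (streaming) parser and once by building the full tree first. The element and attribute names (year, month, day, shipment, item, name, total) and the sample data are assumptions made up for illustration; only the five-level nesting comes from the description above.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample document: all names and values are illustrative assumptions.
SAMPLE = """<year value="2013">
  <month name="Jan" total="2">
    <day date="2013-01-05">
      <shipment id="S1"><item sku="A"/><item sku="B"/></shipment>
    </day>
  </month>
  <month name="Feb" total="1">
    <day date="2013-02-11">
      <shipment id="S2"><item sku="C"/></shipment>
    </day>
  </month>
</year>"""


class MonthHandler(xml.sax.ContentHandler):
    """Collects the attributes of <month> elements from parse events only."""

    def __init__(self):
        super().__init__()
        self.months = []

    def startElement(self, name, attrs):
        # Every element is still visited, but no tree node is ever allocated.
        if name == "month":
            self.months.append(dict(attrs.items()))


def months_streaming(xml_text):
    """Streaming variant: reads the whole input, keeps only month attributes."""
    handler = MonthHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.months


def months_tree(xml_text):
    """Tree variant: builds the full in-memory tree, then queries it."""
    root = ET.fromstring(xml_text)
    return [dict(m.attrib) for m in root.iter("month")]


if __name__ == "__main__":
    print(months_streaming(SAMPLE))
    print(months_tree(SAMPLE))
```

Both functions read the entire input and return the same list of attribute dictionaries. They differ only in whether a tree is built along the way, which is the trade-off in question.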
Is it possible that such a transformation could benefit from streaming, even though the entire document must be read? I was hoping that some time might be gained because there is no need to build trees, but in my limited testing it appears this is not the case.
I tried such an example in Saxon 9.5.1.3, and the streaming version was about 20% slower than the non-streaming version. Perhaps the overhead of streaming execution will almost always outweigh the time saved by not building trees? (At least in Saxon, where tree building is very fast.)
Or am I making an error in my testing, and there are clear examples where streaming is more efficient, even when the entire document has to be read?