I am working with many different XML files where I do not know the iteration element in advance.
By iteration element I mean the element that is repeated throughout the XML file (it shows up in XSD files as maxOccurs="unbounded").
For example, an orders file might contain a repeated element called order.
Some examples of the structures I receive are:
<orders>
  <order>...</order>
  <order>...</order>
</orders>

<products>
  <product>...</product>
  <product>...</product>
</products>

<root>
  <element>...</element>
  <element>...</element>
</root>

<products>
  <section>
    <someelement>content</someelement>
    <item>...</item>
    <item>...</item>
    <item>...</item>
    <item>...</item>
  </section>
</products>

In the examples above, the iterators/repeaters are:
orders > order
products > product
root > element
products > section > item
My usual way to estimate the iterator is to load the full XML file into an XmlDocument, generate an XSD schema from it, and then find the first element marked with maxOccurs that has subelements. This works fine, but XmlDocument doesn't work well with very large XML files (GB size).
For these I need to use an XmlReader, but I have no idea how to approach estimating the iterator with an XmlReader, since I can't use the XSD trick.
So I am looking for input on how to estimate it. Any ideas are appreciated.
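
To make the goal concrete, here is a rough sketch of a forward-only heuristic. It is written in Python, with `xml.etree.ElementTree.iterparse` standing in for XmlReader, so it only illustrates the idea and is not a .NET solution. The function name `guess_iteration_path` and the `require_children` flag are made up for this example. The assumption is that the iteration element is the first tag that occurs at least twice among the direct children of the same parent and, mirroring the XSD trick, that at least one occurrence has subelements.

```python
import io
import xml.etree.ElementTree as ET


def guess_iteration_path(source, require_children=True):
    """Return the tag path of the first repeated sibling element, or None.

    Hypothetical helper: `source` is a file name or a binary file object.
    """
    path = []    # tag names of the currently open elements, root first
    frames = []  # one dict per open element, tracking its direct children

    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if frames:
                frames[-1]["has_children"] = True
            path.append(elem.tag)
            frames.append({"has_children": False, "counts": {}, "nested": set()})
            continue

        # "end" event: the element is complete, so fold it into its parent.
        frame = frames.pop()
        if frames:
            parent = frames[-1]
            parent["counts"][elem.tag] = parent["counts"].get(elem.tag, 0) + 1
            if frame["has_children"]:
                parent["nested"].add(elem.tag)
            repeated = parent["counts"][elem.tag] >= 2
            if repeated and (not require_children or elem.tag in parent["nested"]):
                return list(path)  # stop early, the rest of the file is not read
        path.pop()
        elem.clear()  # drop the finished element's content to limit memory use

    return None


sample = (
    b"<products><section><someelement>content</someelement>"
    b"<item><id>1</id></item><item><id>2</id></item>"
    b"</section></products>"
)
print(" > ".join(guess_iteration_path(io.BytesIO(sample))))
```

Because the function returns as soon as the second sibling closes, only the beginning of a large file is read in the common case. If nothing repeats, the whole file is scanned, and namespaced documents yield tags in the `{namespace}name` form. The same bookkeeping (a stack of per-parent child counters keyed by depth) should carry over to any pull-style reader, but that port is untested here.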