My application consumes XML data from different vendors. Each vendor uses a different XML format/schema, so custom queries are required to retrieve the data from each vendor's XML.
I initially took an RDBMS approach: after retrieving an XML file from a vendor, I would parse and query it (using the Woodstox StAX parser) and write the data into tables. However, because an RDBMS has a fixed schema, I cannot support all the XML formats from the different vendors, and even where I can, I have to "normalize" the hierarchical XML into fixed-schema relational data.
The XML data from the vendors is updated frequently every day, and file sizes range from a few KB up to 50 MB.
As a next step, I am evaluating several native XML databases (NXDs), namely eXist-db, Sedna, BaseX and MonetDB, to see whether one of them would suit my purposes.
Can someone who has built a similar system, one that handles a lot of XML data in different formats/schemas, please offer some practical advice on how to approach this?
Here are the core XML requirements I am trying to meet:
- Handle multiple XML data files from multiple sources. The XML differs from vendor to vendor.
- Update XML in the DB, either the whole document or individual fields in an existing document.
- Identify which vendor a document came from and run the matching queries.
- Query the XML using XPath/XQuery and present the data to users in a common view.
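
To make the last two requirements concrete, here is a minimal sketch of the idea in Python, using only the standard library. It assumes a vendor can be recognized from the root element name; the vendor names, element names, paths and sample documents are all hypothetical, and ElementTree supports only a subset of XPath (no XQuery), so an NXD would replace this lookup with real XQuery.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: root element name -> (vendor name, field -> path).
# Each vendor gets its own paths, but all map onto the same common fields.
VENDOR_QUERIES = {
    "catalog": ("vendor_a", {"id": "./item/sku", "price": "./item/cost"}),
    "products": (
        "vendor_b",
        {"id": "./product/code", "price": "./product/pricing/amount"},
    ),
}


def local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def to_common_view(xml_text):
    """Identify the vendor from the root element and run that vendor's paths."""
    root = ET.fromstring(xml_text)
    key = local_name(root.tag)
    if key not in VENDOR_QUERIES:
        raise ValueError(f"unknown vendor format: <{key}>")
    vendor, paths = VENDOR_QUERIES[key]
    record = {"vendor": vendor}
    for field, path in paths.items():
        # findtext returns None when the path matches nothing.
        record[field] = root.findtext(path)
    return record


# Made-up sample documents, one per hypothetical vendor.
VENDOR_A_SAMPLE = "<catalog><item><sku>A-1</sku><cost>9.50</cost></item></catalog>"
VENDOR_B_SAMPLE = (
    "<products><product><code>B-7</code>"
    "<pricing><amount>12.00</amount></pricing></product></products>"
)

if __name__ == "__main__":
    print(to_common_view(VENDOR_A_SAMPLE))
    print(to_common_view(VENDOR_B_SAMPLE))
```

Keeping the per-vendor paths in a lookup table means that supporting a new vendor is a data change rather than a schema change, which is the property I am hoping an NXD would give me at a larger scale.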
Please advise.
Thanks, Subhro.