I'm working on a system that will act as an OLAP engine for a dataset produced by a simulation toolchain. The tools generate their results in XML.
The simplest solution, to me, would have been to use spark-xml to access the XML files directly from Python, Scala, etc. The problem is that the project owners want to use C#, since that is what the original simulation toolchain is built in. I know there is SparkCLR for C#, but I don't know of a good way to use spark-xml from C#.
Does anyone have any suggestions on how to do this? If not, I guess the next option would be to translate the datasets into a format that SparkCLR handles more natively, but I'm not sure of the best approach.
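
To make the translation option concrete, here is a rough sketch of a pre-processing step that flattens XML records into line-delimited JSON, a format Spark's built-in JSON source can read without spark-xml. It assumes that SparkCLR exposes that JSON reader, which I have not verified. The element names (`results`, `run`, `tool`, `elapsed`), the `SAMPLE` data, and the function name `xml_records_to_json_lines` are all hypothetical; the real toolchain output will look different and may be nested more deeply than this handles.

```python
import json
import xml.etree.ElementTree as ET


def xml_records_to_json_lines(xml_text, record_tag):
    """Flatten each <record_tag> element into one JSON object per line.

    Hypothetical helper: only attributes and leaf child elements are handled.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for record in root.iter(record_tag):
        row = dict(record.attrib)  # XML attributes become columns
        for child in record:
            # Leaf children become columns; all values stay as strings
            row[child.tag] = (child.text or "").strip()
        lines.append(json.dumps(row, sort_keys=True))
    return "\n".join(lines)


# Made-up sample input, not the real toolchain schema
SAMPLE = """<results>
  <run id="1"><tool>solver</tool><elapsed>12.5</elapsed></run>
  <run id="2"><tool>mesher</tool><elapsed>3.0</elapsed></run>
</results>"""

if __name__ == "__main__":
    print(xml_records_to_json_lines(SAMPLE, "run"))
```

Every value comes out as a string here, so numeric columns would need casting after loading. For large files, a streaming parser such as `ET.iterparse` would probably be a better fit than parsing the whole document in memory.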