While trying to load a model from a CIM/XML file according to IEC 61970 (the Common Information Model for power system models), I ran into a problem. In JAXB, object graphs between elements are built with @XmlID and @XmlIDREF, and the two values must be equal to match. But in CIM/RDF a reference to a resource by ID, e.g. rdf:resource="#_37C0E103000D40CD812C47572C31C0AD", contains the "#" character. Consequently, JAXB is unable to match "GeographicalRegion" with "SubGeographicalRegion.Region" whenever the rdf:resource attribute contains the "#" character.

Here is an example:

<cim:GeographicalRegion rdf:ID="_37C0E103000D40CD812C47572C31C0AD">
<cim:IdentifiedObject.name>GeoRegion</cim:IdentifiedObject.name>
<cim:IdentifiedObject.localName>OpenCIM3bus</cim:IdentifiedObject.localName>
</cim:GeographicalRegion>
<cim:SubGeographicalRegion rdf:ID="_ID_SubGeographicalRegion">
<cim:IdentifiedObject.name>SubRegion</cim:IdentifiedObject.name>
<cim:IdentifiedObject.localName>SubRegion</cim:IdentifiedObject.localName>
<cim:SubGeographicalRegion.Region rdf:resource="#_37C0E103000D40CD812C47572C31C0AD"/>
</cim:SubGeographicalRegion>
Omar Arturo
  • The examples you give are not legal RDF, and they do not seem to be legal CIM-XML either (I'd never heard of CIM before to be honest, but what I can find about the format does not match your examples). – Jeen Broekstra Mar 16 '14 at 03:58
  • I modified the example to show the problem more clearly. – Omar Arturo Mar 17 '14 at 17:14
  • Are you hung up on using JAXB to process this? Because if you could switch to an RDF API (such as Apache Jena or OpenRDF Sesame) this would probably become quite a bit easier: such tools automatically resolve these differences. – Jeen Broekstra Mar 17 '14 at 20:49
  • We don't want to query the RDF directly, as we have found that reasoning over and searching RDF directly is very inefficient when the XML contains thousands of items. For that reason we did not choose Jena, Sesame or XSLT, since we have already had bad experiences with those technologies. We would rather process the data as objects in an object-oriented language such as Java. – Omar Arturo Mar 19 '14 at 15:06
  • The difficulty we have with JAXB is that we need to change the value of an ID attribute of an XML element before parsing. We are considering modifying the RDF file before passing it to the unmarshaller, changing rdf:resource="#ID" to rdf:resource="ID" so that @XmlID and @XmlIDREF match in JAXB. But this would also increase the computational cost when there are thousands of elements or more. – Omar Arturo Mar 19 '14 at 15:07

1 Answer

I realize you're asking for a solution using JAXB, but I would urge you to consider an RDF-based solution as it is more flexible and robust. You're basically trying to reinvent what RDF parsers already have built in. RDF/XML is a difficult format to parse, so it doesn't make much sense to try and hack your own parsing together, especially since files that have very different XML structures can express exactly the same information: this only becomes apparent when looking at the level of the RDF. You may find that your JAXB parser workaround works on one CIM/RDF file but completely fails on another.

So, here's an example of how to process your file using the Sesame RDF API. No inferencing is involved, this just parses the file and puts it in an in-memory RDF model, which you can then manipulate and query from any angle.

Assuming the root element of your CIM file looks something like this:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" 
         xmlns:cim="http://example.org/cim/">

(only a guess of course, but I need prefixes for a proper example)

Then you can do the following, using Sesame's Rio RDF/XML parser:

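 // Needs java.io.FileInputStream, org.openrdf.model.Model, org.openrdf.rio.Rio and org.openrdf.rio.RDFFormat
 String baseURI = "http://example.org/my/file"; // relative references in the file are resolved against this
 FileInputStream in = new FileInputStream("/path/to/my/cim.rdf");
 Model model = Rio.parse(in, baseURI, RDFFormat.RDFXML);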

This creates an in-memory RDF model of your document. You can then simply filter-query over that. For example, to print out the properties of all resources that have _37C0E103000D40CD812C47572C31C0AD as their SubGeographicalRegion.Region:

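 // Needs java.util.Set, org.openrdf.model.Resource, org.openrdf.model.URI,
 // org.openrdf.model.ValueFactory and org.openrdf.model.impl.ValueFactoryImpl
 String CIM_NS = "http://example.org/cim/";
 ValueFactory vf = ValueFactoryImpl.getInstance();
 URI subRegion = vf.createURI(CIM_NS, "SubGeographicalRegion.Region");
 // The referenced region, identified by base URI + "#" + ID
 URI res = vf.createURI("http://example.org/my/file#_37C0E103000D40CD812C47572C31C0AD");
 // All resources that point at that region through SubGeographicalRegion.Region
 Set<Resource> subs = model.filter(null, subRegion, res).subjects();

 for (Resource sub: subs) {
     System.out.println("resource: " + sub + " has the following properties: ");
     // Print each property of the matching resource together with its value
     for (URI prop: model.filter(sub, null, null).predicates()) {
          System.out.println(prop + ": " + model.filter(sub, prop, null).objectValue());
     }
 }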

Of course at this point you can also choose to convert the model to some other syntax format for further handling by your application - as you see fit. The point is that the difference between the identifiers with the leading # and without has been resolved for you by the RDF/XML parser.
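
To make the last point concrete, here is a minimal sketch, using only java.net.URI from the JDK (not Sesame's URI type), of why the two spellings end up identical once they are treated as RDF identifiers. The class and method names are hypothetical, and the base URI is the assumed one from the example above. It also simplifies rdf:ID handling to plain concatenation, which only holds when the base URI has no fragment of its own.

```java
import java.net.URI;

/** Hypothetical illustration of how rdf:ID and rdf:resource map to the same identifier. */
final class RdfIdResolution {

    /** rdf:ID="x" names the resource baseUri + "#x" (simplified: base has no fragment). */
    static URI fromRdfId(String baseUri, String id) {
        return URI.create(baseUri + "#" + id);
    }

    /** rdf:resource="..." is a URI reference that is resolved against the base URI. */
    static URI fromRdfResource(String baseUri, String reference) {
        return URI.create(baseUri).resolve(reference);
    }

    public static void main(String[] args) {
        String base = "http://example.org/my/file"; // assumed base URI
        URI declared = fromRdfId(base, "_37C0E103000D40CD812C47572C31C0AD");
        URI referenced = fromRdfResource(base, "#_37C0E103000D40CD812C47572C31C0AD");
        System.out.println(declared);
        // Both forms denote the same resource, so this comparison holds
        System.out.println(declared.equals(referenced));
    }
}
```

An RDF/XML parser performs this kind of resolution for every identifier it reads, which is why no manual "#" handling is needed on the RDF side.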

This is of course personal opinion only, since I don't know the details of your use case, but I think you'll find that this is quite quick and flexible. I should also point out that although the above solution keeps the entire model in memory, you can easily adapt this to a more streaming (and therefore less memory-intensive) approach if you find your files are too big.

Jeen Broekstra
  • Sorry for commenting again so late. I very much appreciate your advice, but I think you have not caught the issue yet. CIM (Common Information Model) is an ontology widely accepted as a standard (IEC 61970) in the power systems community, so we know beforehand what the model looks like. Our goal is to map all data from a CIM XML file to a Java model that mirrors the CIM ontology. – Omar Arturo Apr 02 '14 at 19:01
  • I have already built the static Java model (classes, hierarchy and associations) according to the CIM standard, from the UML definition available on the model's web page, using the Visual Paradigm tool to generate the Java code. I was instantiating the Java objects with JDOM, using the resources and the references between resources described in the CIM XML file. This gives me a toolbox for more complex and efficient operations on the Java classes. – Omar Arturo Apr 02 '14 at 19:02
  • Why do we want to use JAXB rather than Jena, Sesame or XSLT? First, with JAXB there is no need to write a search for each query to obtain the value of each attribute or association in the RDF elements. With Jena, Sesame or XSLT, however, each query on the RDF file is equivalent to looping over the complete RDF file, which makes every query a critically time-consuming task. – Omar Arturo Apr 02 '14 at 19:02
  • With JAXB, on the other hand, each element found is mapped directly to the Java model, and once in Java the queries, and consequently the searches, are more efficient, since they run over lists rather than through several searches of the overall graph as in RDF. Second, I do not need to manage the model as an RDF file because, as I mentioned above, the model can be handled better in Java to perform more complex operations efficiently. – Omar Arturo Apr 02 '14 at 19:06
  • I want to treat the RDF natively as an XML file and use JAXB's '@XmlID' and '@XmlIDREF' to match the elements that have the same ID. Why do I need help? Because I have already managed to load the complete model from RDF into the Java model using JAXB, with a direct mapping. But the references between objects don't work because the IDs used for the mapping are not equal: references to other resources are made through the rdf:resource attribute, whose value is the character "#" plus the ID. – Omar Arturo Apr 02 '14 at 19:09
  • What do I really need? A solution using JAXB that handles the ID as if the rdf:resource attributes did not have the "#" symbol, or that removes the "#" symbol before JAXB parses the file. – Omar Arturo Apr 02 '14 at 19:10
  • I'm not a JAXB expert so I can't help you with that, I'm afraid. All I can do is repeat that I think you are underestimating the performance of modern RDF tools. You say: "each query is equivalent to a search by loops in the complete RDF file", which is not at all how Sesame (or the solution I show in the answer above) works. But we're getting far afield here. Sorry I couldn't be of more help. I'll leave the answer here as it may convince others that this is a good alternative. I hope someone else can help you with a JAXB-based solution. – Jeen Broekstra Apr 02 '14 at 19:16
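
For readers who do want to stay with JAXB, one possible direction for the "remove the # before JAXB parses" idea from the comments is to filter attribute values while the XML is being streamed, instead of rewriting the file first. The sketch below is not taken from the thread: the class name RdfResourceHashStripper and its helper method are hypothetical, and it uses only the StAX API from the JDK (StreamReaderDelegate). The assumption, which should be verified against the JAXB implementation in use, is that a reader wrapped this way can be passed to Unmarshaller.unmarshal(XMLStreamReader) and that JAXB reads attribute values through the overridden methods.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.util.StreamReaderDelegate;

/** Hypothetical helper: exposes rdf:resource="#X" to downstream consumers as "X". */
final class RdfResourceHashStripper extends StreamReaderDelegate {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    RdfResourceHashStripper(XMLStreamReader reader) {
        super(reader);
    }

    private static boolean isRdfResource(String ns, String local) {
        return RDF_NS.equals(ns) && "resource".equals(local);
    }

    private static String strip(String value) {
        return (value != null && value.startsWith("#")) ? value.substring(1) : value;
    }

    @Override
    public String getAttributeValue(int index) {
        String value = super.getAttributeValue(index);
        return isRdfResource(getAttributeNamespace(index), getAttributeLocalName(index))
                ? strip(value) : value;
    }

    @Override
    public String getAttributeValue(String namespaceURI, String localName) {
        // Lookups with a null namespace are deliberately left untouched
        String value = super.getAttributeValue(namespaceURI, localName);
        return isRdfResource(namespaceURI, localName) ? strip(value) : value;
    }

    /** Demo helper: collects every rdf:resource value as seen through the filter. */
    static List<String> resourceIds(String xml) {
        List<String> ids = new ArrayList<>();
        try {
            XMLStreamReader reader = new RdfResourceHashStripper(
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml)));
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                for (int i = 0; i < reader.getAttributeCount(); i++) {
                    if (isRdfResource(reader.getAttributeNamespace(i),
                            reader.getAttributeLocalName(i))) {
                        ids.add(reader.getAttributeValue(i));
                    }
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read the XML input", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        String xml = "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\" xmlns:cim=\"http://example.org/cim/\">"
                + "<cim:SubGeographicalRegion rdf:ID=\"_ID_SubGeographicalRegion\">"
                + "<cim:SubGeographicalRegion.Region"
                + " rdf:resource=\"#_37C0E103000D40CD812C47572C31C0AD\"/>"
                + "</cim:SubGeographicalRegion></rdf:RDF>";
        // The reference is reported without its leading '#'
        System.out.println(resourceIds(xml));
    }
}
```

Because the value is adjusted as each element streams past, this avoids a separate pass over the file, which was the cost concern raised in the comments. It only covers same-document references of the form "#ID"; an rdf:resource holding a full URI would pass through unchanged and still not match an rdf:ID.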