I'm using Jena to read an RDFa file. After creating the Model, I read the RDFa file into it (basic Jena usage).

When I host the file online and pass its URL to the model, everything works as expected: the contents of the file, together with the RDFa information, are available for further processing. When I store the file locally and access it through the URL "file:///Users/Piejero/file.xhtml", everything works fine as well. But when I read the same local file through an InputStream (from a File), I get the following error:

Exception in thread "main" org.apache.jena.riot.RiotException: {E202} Expecting XML start or end element(s). String data "Metadata" not allowed. Maybe there should be an rdf:parseType='Literal' for embedding mixed XML content in RDF. Maybe a striping error.

("Metadata" is the title of the XHTML page, which is encoded as Unicode (UTF-8).)

I think we're dealing with an I/O issue, but how can it be solved? From my experiments I conclude that there is nothing wrong with the file itself.

The code for the failing case is:

JenaRdfaReader.inject();
Model model = ModelFactory.createDefaultModel();
File f = new File("/Users/Piejero/file.xhtml");
model.read(new FileInputStream(f), "RDFA");

I'm using Semargl to add RDFa support to Jena.

1 Answer

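The problem is probably already resolved, but here is what works for me. The code uses a FileReader instead of an InputStream. Note that it also calls the three-argument form read(reader, base, lang) with an empty base URI, so "RDFA" is passed as the language. In the question's two-argument call, Jena's Model.read(InputStream, String) treats the second argument as the base URI and parses with its default RDF/XML reader, which would explain the RDF/XML-style error message.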

    // Register Semargl's RDFa reader with Jena
    JenaRdfaReader.inject();
    Model m = ModelFactory.createDefaultModel();
    try {
        // Arguments: reader, base URI (empty here), language
        m.read(new FileReader("C:\\data\\workspaces\\websites\\bla.htm"), "", "RDFA");
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
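
One caveat with a plain FileReader: it decodes bytes with the JVM's default charset, which before Java 18 depends on the platform, while the file in the question is UTF-8. Below is a minimal sketch, using only the standard library, of opening the file with an explicit charset instead. The names Utf8ReaderSketch and openUtf8 are hypothetical, the sample markup is made up for the demonstration, and the Jena call appears only as a comment because it was not run as part of this sketch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8ReaderSketch {

    // Hypothetical helper: open a file as a Reader that always decodes UTF-8,
    // independent of the JVM's default charset.
    static Reader openUtf8(Path path) throws IOException {
        return Files.newBufferedReader(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample content standing in for the XHTML file.
        Path tmp = Files.createTempFile("rdfa-sketch", ".xhtml");
        try {
            Files.write(tmp, "<title>M\u00e9tadata</title>".getBytes(StandardCharsets.UTF_8));

            // Read the first line back through the UTF-8 reader and print it.
            try (BufferedReader r = new BufferedReader(openUtf8(tmp))) {
                System.out.println(r.readLine());
            }

            // Assumed usage with Jena and Semargl on the classpath (not run here):
            //   try (Reader reader = openUtf8(path)) {
            //       m.read(reader, "", "RDFA");
            //   }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

With Jena and Semargl available, the reader returned by openUtf8 could take the place of the FileReader in the call above. Closing it in a try-with-resources block also avoids leaking the file handle.
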
Joe