I'm trying to get the official example for Apache Jena Text to work with an RDF file. The official example is given here.

Honestly, I think there is too little documentation and the example is too generic. It does not provide a real RDF file to use as an example, and there are a lot of things to configure. I'm trying to analyze this RDF file.

--UPDATE--

I found the files used in the official example as mentioned in a comment to this question.

Thus, I defined the following ttl file by mixing the original example with the foaf.rdf file. Now I have the file foaf.ttl:

@prefix :    <http://localhost/jena_example/#> .
@prefix dc:    <http://purl.org/dc/elements/1.1/> .
@prefix con:   <http://www.w3.org/2000/10/swap/pim/contact#> .
@prefix geo:   <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix foaf:  <http://xmlns.com/foaf/0.1/> .
@prefix s:     <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix cc:    <http://creativecommons.org/ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .


:T1 rdfs:label "X0 X1 X2" .
:T2 rdfs:label "X10 X11 X12" .

:B1 rdfs:label "X1" .
:B2 foaf:name "X1" .
:B3 foaf:name "Sean" .

:Sean
        a          foaf:Person ;
        foaf:name  "Sean Palmer" .


:Tim_Bray
        a          foaf:Person ;
        foaf:name  "X1" .

:me
        foaf:name  "Oshani Seneviratne" .

:John_Gage
        a          foaf:Person ;
        foaf:img   <http://upload.wikimedia.org/wikipedia/commons/d/de/John_Gage.jpg> ;
        foaf:name  "John Gage" .

Starting from the original Java file mentioned above, I changed main to:

public static void main(String [] args){
    TextQuery.init();
    Dataset ds = createCode();
    //Dataset ds = createAssembler() ;
    loadData(ds, "foaf.ttl") ;
    queryData(ds) ;        
}

In the queryData method I have:

    String pre = StrUtils.strjoinNL
        ( "PREFIX : <http://localhost/jena_example/#>"
        , "PREFIX dc: <http://purl.org/dc/elements/1.1/>"
        , "PREFIX con: <http://www.w3.org/2000/10/swap/pim/contact#>"
        , "PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>"
        , "PREFIX foaf: <http://xmlns.com/foaf/0.1/>"
        , "PREFIX s: <http://www.w3.org/2000/01/rdf-schema#>"
        , "PREFIX owl: <http://www.w3.org/2002/07/owl#>"
        , "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>"
        , "PREFIX cc: <http://creativecommons.org/ns#>"
        , "PREFIX text: <http://jena.apache.org/text#>"
        , "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>") ;       


    String qs = StrUtils.strjoinNL
        ( "SELECT * "
        , " { ?res text:query ('X*' 10) ;"
        , "      rdfs:label ?label"
        , " }") ;     

And in createCode() I have:

    // Define the index mapping 
    EntityDefinition entDef = new EntityDefinition("uri", "text", RDFS.label.asNode()) ;

The result is:

-----------------------
| res | label         |
=======================
| :T1 | "X0 X1 X2"    |
| :T2 | "X10 X11 X12" |
| :B1 | "X1"          |
-----------------------

However, X1 also appears in the triple:

:B2 foaf:name "X1" .

but B2 is not in the result set. One may say: "You have to define an index". Well, the very strange thing is that if I set in createCode():

    // Define the index mapping 
    EntityDefinition entDef = new EntityDefinition("blablabla", "blablabla", RDFS.label.asNode()) ;

the result doesn't change!

So, what is the role of EntityDefinition? What am I doing wrong?

mat_boy
  • You're searching for the string `'Text*'`. I'm not sure how regular expressions are handled, but when I look at [the file](https://raw.github.com/RDFLib/rdflib/master/examples/foaf.rdf) that you loaded, I don't even see any occurrence of the string `Text`. Why do you expect results? I do see something with a value for `foaf:name` of `Dean Jackson`. Do you get results with `?about text:query ('Dean' 20)`? – Joshua Taylor Feb 18 '14 at 15:46
  • I'm sorry, I was initially preparing a generic example with a generic RDF file, but then I decided to put the real code. I've updated the question! Thanks for pointing it out! – mat_boy Feb 18 '14 at 15:49
  • Ok, there are also `John`s in there. What do you get if you search without that `*`? – Joshua Taylor Feb 18 '14 at 15:50
  • Also, since it appears that some [configuration](http://jena.apache.org/documentation/query/text-query.html#configuration) is needed in advance, you'll probably need to provide that, too, if you hope to get useful responses. There's not enough here yet to reproduce the problem that you're seeing. – Joshua Taylor Feb 18 '14 at 15:52
  • I still get no results. – mat_boy Feb 18 '14 at 15:52
  • The DataSet Assembler is disabled in the example given by the apache group. Should I define it? – mat_boy Feb 18 '14 at 15:53
  • I don't know; this isn't a feature that I've used. Some of the Jena developers might see this question though, and if you provide your entire code, as well as your configuration, you'll be more likely to get a useful response. – Joshua Taylor Feb 18 '14 at 15:55
  • @JoshuaTaylor Right! Thanks a lot anyway! I will play a lot with the DataSetAssembler. – mat_boy Feb 18 '14 at 15:58
  • Btw I agree that the documentation is somewhat weak for this component, I have filed [JENA-644](https://issues.apache.org/jira/browse/JENA-644) to track improving this – RobV Feb 18 '14 at 17:20
  • @RobV I think this may help. I found the source files for the `JenaTextExample1.java`. They are [here](https://apache.googlesource.com/jena/+/trunk/jena-text) and with these files everything works. I have to see what is wrong with my files. – mat_boy Feb 18 '14 at 18:28

1 Answer

As far as I can tell, your problem is primarily down to your entity definition. I'm fairly sure that, with the entity definition you've used, your text index will be empty. If you've used a disk-based Lucene index, you can use a tool like Luke to confirm this.

Your entity definition is as follows:

EntityDefinition entDef = new EntityDefinition("rdf:about", "rdf:resource", RDFS.label.asNode()) ;

Which is problematic in a couple of ways:

  1. You can't use prefixed names for the entityField (the first parameter); you need to use a full URI
  2. rdf:about is not a real URI; it is a syntax detail of RDF/XML, so indexing this will always index nothing

It is also important to note that what you've shown is incomplete code and it only pertains to accessing an existing text index. There is nothing to show if and how you've actually indexed the text in your RDF.
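
To make the link between indexing and the mapping concrete, here is a toy model in plain Java. It is not the Jena or Lucene API: `ToyTextIndex`, `add`, `prefixQuery` and `demo` are hypothetical names invented for illustration. It assumes (this is an assumption, not something confirmed above) that the index only stores literals of the one predicate handed to the entity definition, and that the text field name is just an internal label used for both writing and reading. The data is the rdfs:label / foaf:name triples from the updated question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: this is NOT the Jena or Lucene API.
class ToyTextIndex {
    private final String textField;        // arbitrary internal field name (assumption)
    private final String mappedPredicate;  // the only predicate whose literals get indexed
    // field name -> list of {subject, literal} entries
    private final Map<String, List<String[]>> fields = new LinkedHashMap<>();

    ToyTextIndex(String textField, String mappedPredicate) {
        this.textField = textField;
        this.mappedPredicate = mappedPredicate;
    }

    // Called for every loaded triple; triples with an unmapped predicate are skipped.
    void add(String subject, String predicate, String literal) {
        if (!predicate.equals(mappedPredicate)) {
            return;
        }
        fields.computeIfAbsent(textField, k -> new ArrayList<>())
              .add(new String[] {subject, literal});
    }

    // Prefix search over whitespace-separated tokens; "X" stands in for 'X*'.
    List<String> prefixQuery(String prefix) {
        List<String> hits = new ArrayList<>();
        for (String[] entry : fields.getOrDefault(textField, new ArrayList<>())) {
            for (String token : entry[1].split("\\s+")) {
                if (token.startsWith(prefix)) {
                    hits.add(entry[0]);
                    break;
                }
            }
        }
        return hits;
    }

    // Loads a subset of the question's triples and runs the prefix query.
    static List<String> demo(String fieldName) {
        ToyTextIndex idx = new ToyTextIndex(fieldName, "rdfs:label");
        idx.add(":T1", "rdfs:label", "X0 X1 X2");
        idx.add(":T2", "rdfs:label", "X10 X11 X12");
        idx.add(":B1", "rdfs:label", "X1");
        idx.add(":B2", "foaf:name", "X1");   // skipped: foaf:name is not mapped
        idx.add(":B3", "foaf:name", "Sean"); // skipped as well
        return idx.prefixQuery("X");
    }

    public static void main(String[] args) {
        System.out.println(demo("text"));
        System.out.println(demo("blablabla"));
    }
}
```

In this model both calls return the same three subjects, because the field name is only a label, and :B2 never shows up because its literal hangs off foaf:name. If the real index behaves the same way, getting :B2 back would need foaf:name to be part of the mapping as well.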

RobV
  • I dunno! I get the same result if I set `EntityDefinition entDef = new EntityDefinition("http://www.w3.org/1999/02/22-rdf-syntax-ns#resource", "http://www.w3.org/1999/02/22-rdf-syntax-ns#about", RDFS.label.asNode());` – mat_boy Feb 18 '14 at 17:14
  • As I pointed out in my answer, your question is incomplete. Regardless of the entity definition, it is unclear whether you've actually created a valid text index and connected it to the dataset you are querying. Please provide a complete example if you want further help – RobV Feb 18 '14 at 17:15
  • I understand! I didn't create any index! I simply followed what is shown in the only example provided in the official doc, i.e. file [JenaTextExample1.java](https://svn.apache.org/repos/asf/jena/trunk/jena-text/src/main/java/examples/JenaTextExample1.java). There, no indexing is shown. – mat_boy Feb 18 '14 at 17:41
  • I updated the example such that one can repeat the experiment at home – mat_boy Feb 19 '14 at 09:18