In eXist-db 4.4 I am attempting to implement a basic Lucene query structure, but it is returning no results.

In /db/apps/deheresi/data I have a collection of tei-xml documents which have the same structure, and I want to apply my query only to the text content found within the element tei:seg and its descendants. A typical sample would be:

<TEI>
 <text>
   [...]
    <seg type="dep_event" subtype="event" xml:id="MS609-0001-1">
           <pb n="1r"/>
           <lb break="n" n="1"/>
           <date type="deposition_date" when="1245-05-27" cert="high">Anno
              Domini M° CC° XL° quinto VI Kalendas Iunii.</date>  
           <persName nymRef="#Arnald_Garnier_MSP-AU" role="dep">Arnaldus Garnerii</persName> 
           testis iuratus dixit quod vidit in 
           <placeName type="event_loc" nymRef="#home_of_Cap-de-Porc">domo 
              <persName nymRef="#Peire_Cap-de-Porc_MSP-AU" role="own">Petri de Sancto Andrea</persName>
           </placeName>
           <lb break="y" n="2"/>
           <persName nymRef="#Bernard_Cap-de-Porc_MSP-AU" role="her">B<supplied reason="expname">ernardum</supplied> de Sancto Andrea</persName>, 
           fratrem dicti Petri, et socium eius, hereticos. Et vidit ibi cum eis dictum
           <persName nymRef="#Peire_Cap-de-Porc_MSP-AU" ana="#uAdo" role="par">P<supplied reason="expname">etrum</supplied> de Sancto Andrea</persName> et 
           <persName nymRef="#Susanna_Cap-de-Porc_MSP-AU" ana="#uAdo" role="par">uxor dicti<lb break="y" n="3"/>Petri</persName>. Et 
           <persName nymRef="#Arnald_Garnier_MSP-AU" ana="#pAdo" role="par"/>ipse
           testis adoravit ibi dictos hereticos, sed non vidit alios adorare. Et 
           <date type="event_date" when="1239">sunt VI anni vel circa</date>. 
           <seg type="inq_int" subtype="specific_question">Et quando ipse testis exivit<lb break="y" n="4"/>domum invenit
                 <persName nymRef="#Guilhem_de_Rosengue_MSP-AU" key="inqint" ana="#pIntra" role="ref">Willelmus de Rozergue</persName> intrantem ad dictos hereticos.</seg>
        </seg>
        <seg>
          [...]
        </seg>
    [...]
  </text>
</TEI>

I created and applied a Lucene index as follows (including ignore on certain elements):

<collection xmlns="http://exist-db.org/collection-config/1.0">
  <index xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <lucene>
        <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
        <text qname="tei:seg"/>
        <ignore qname="tei:note"/>
        <ignore qname="tei:gap"/>
        <ignore qname="tei:del"/>
        <ignore qname="tei:orig"/>
        <inline qname="tei:supplied"/>
    </lucene>
  </index>
</collection>

I then run a query for a single Latin word that occurs multiple times in every document in the collection:

declare namespace tei = "http://www.tei-c.org/ns/1.0";

let $query := 
   <query>
     <term>vidit</term>
   </query>

return 
    collection('/db/apps/deheresi/data')//tei:seg[ft:query(.,$query)]

And I received the response:

`eXist-db localhost 8081 : Your query returned an empty sequence`

Am I overlooking a piece of the Lucene implementation puzzle?

Many thanks in advance.

jbrehr
  • I initially used the 'index' function in the eXist Java client to reindex the database. I've just gone into eXide and reindexed from there and now the same query suddenly returns results. It seems the index was not compiled even though the java client returned `reindex completed`. – jbrehr Nov 14 '18 at 12:28
  • Where did you store your collection.xconf file? – Joe Wicentowski Nov 14 '18 at 12:37
  • It's in `/db/apps/deheresi`. It now works since I reindexed from eXide. – jbrehr Nov 14 '18 at 12:38

1 Answer

When working with eXist indexes, keep in mind that you must store the collection configuration file in a subcollection of /db/system/config/ mirroring the location of the data. So if your data is in /db/apps/deheresi, you must store your collection configuration file as /db/system/config/db/apps/deheresi/collection.xconf.
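A quick way to check this is to build the mirrored path and test whether a configuration file is stored there. This is a minimal sketch that assumes the data collection from the question; the variable names are only illustrative.

```xquery
xquery version "3.1";

(: Data collection from the question; adjust for your own application. :)
let $data-collection := "/db/apps/deheresi"

(: The active configuration lives under /db/system/config, mirroring the data path. :)
let $config-path := "/db/system/config" || $data-collection || "/collection.xconf"

return
    if (doc-available($config-path)) then
        "Configuration found at " || $config-path
    else
        "No configuration at " || $config-path
```

If the second message comes back, the index definition has not been stored where eXist reads it from, whatever is stored alongside the data.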

eXide has an extremely convenient feature, which detects when you store a collection configuration file in the database, offers to store a copy of the file in the corresponding location within the /db/system/config subcollection, and reindexes the source collection after the copy is stored.

However, when working outside eXide, keep in mind that edits to, say, /db/apps/deheresi/collection.xconf must be manually copied to the corresponding location under /db/system/config, and the source collection must then be manually reindexed. The new configuration only becomes active once both steps are done.
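The two manual steps can also be scripted. The following is a rough sketch rather than a definitive recipe: it assumes the paths from the question, that it is run by a user in the dba group, and the three-argument `xmldb:copy` available in eXist 4.x (in eXist 5.x the corresponding function is, as far as I know, `xmldb:copy-resource`, which takes different arguments). `local:mkcol` is a hypothetical helper written for this example.

```xquery
xquery version "3.1";

(: Paths from the question; adjust for your own application. :)
declare variable $data-collection := "/db/apps/deheresi";
declare variable $config-root := "/db/system/config";

(: Hypothetical helper: create each collection level below $parent in turn. :)
declare function local:mkcol($parent as xs:string, $steps as xs:string*) as xs:string* {
    if (exists($steps)) then (
        xmldb:create-collection($parent, head($steps)),
        local:mkcol($parent || "/" || head($steps), tail($steps))
    ) else ()
};

let $target := $config-root || $data-collection
let $steps := tokenize($data-collection, "/")[. ne ""]
return (
    (: 1. Make sure /db/system/config/db/apps/deheresi exists. :)
    local:mkcol($config-root, $steps),
    (: 2. Copy the configuration next to the mirrored path (eXist 4.x signature). :)
    xmldb:copy($data-collection, $target, "collection.xconf"),
    (: 3. Rebuild the indexes for the data collection. :)
    xmldb:reindex($data-collection)
)
```

Once this has run, the `ft:query` expression from the question should be evaluated against a freshly built Lucene index.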

Joe Wicentowski