I have records that look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<person>
   <account>
      <domain>ABBVIENET</domain>
      <username>TANGTJ</username>
      <status>ENABLED</status>
   </account>
   <company>AbbVie Inc. (Parent)</company>
   <displayName>Tj Tang</displayName>
   <upi>10025613</upi>
   <firstName>
      <preferred>TJ</preferred>
      <given>Tze-John</given>
   </firstName>
   <middleName/>
   <lastName>
      <preferred>Tang</preferred>
      <given>Tang</given>
   </lastName>
   <secondaryLastName/>
   <address>
      <streetAddress>1 N Waukegan Road</streetAddress>
      <buildingCode>AP52</buildingCode>
      <city>North Chicago</city>
      <state>Illinois</state>
      <country>
         <code>US</code>
         <name>United States</name>
      </country>
   </address>
   <emailAddress>tze-john.tang@abbvie.com</emailAddress>
   <title>Principal Research Scientist</title>
   <managerUpi>10009618</managerUpi>
</person>

When I search using:

search:search("Tang TJ AbbVie")

I get:

<search:snippet>
  <search:match path="fn:doc(&quot;/person/10025613.xml&quot;)/person/company"><search:highlight>AbbVie</search:highlight> Inc. (Parent)
  </search:match>
  <search:match path="fn:doc(&quot;/person/10025613.xml&quot;)/person/displayName">Tj <search:highlight>Tang</search:highlight></search:match>
  <search:match path="fn:doc(&quot;/person/10025613.xml&quot;)/person/firstName">
    <search:highlight>TJ</search:highlight>
  </search:match>
  <search:match path="fn:doc(&quot;/person/10025613.xml&quot;)/person/lastName">
    <search:highlight>Tang</search:highlight>
  </search:match>
</search:snippet>

This only roughly identifies the element containing the match: for example, the match is in /person/firstName/preferred, but the snippet reports /person/firstName.

If I search for the upi value:

search:search("10025613")

I get:

<search:snippet>
    <search:match path="fn:doc(&quot;/person/10025613.xml&quot;)/person">
      <search:highlight>10025613</search:highlight>
    </search:match>
</search:snippet>

In this case I don't even get a lower-level element for the context; the path stops at /person. How is the element path in a snippet determined? I tried adding an element range index on upi, but I still got the same result.

Ankit Bhardwaj
TJ Tang
  • I tried this code, but it is working fine. Looks like you have done something wrong in configurations. Can you please try this with a new database? – Ankit Bhardwaj May 03 '16 at 03:26
  • I also get `` for the UPI query, using MarkLogic 8.0-5. What version are you running? – Dave Cassel May 04 '16 at 12:48
  • The one this is executing on is 8.0-4.2. – TJ Tang May 05 '16 at 12:47
  • I was using 8.0-3. It was working fine for me. Should also work for the version you are using. Have you tried it again on a fresh database? – Ankit Bhardwaj May 05 '16 at 13:47
  • I can confirm that creating the new database and inserting the single record works correctly. I tried re-indexing the original db, but that did not help things. I will load data into the new db, and re-test to see if I get the same behavior after reloading all of the data. – TJ Tang May 05 '16 at 19:10
  • More results from testing: if I load the document via the default REST endpoint, I get the weird search results. But if I then take the content that is already in the db and do a document-insert on top of the existing document URI, the search behaves correctly, with the correct paths being returned. Any thoughts on this one? – TJ Tang May 05 '16 at 20:15
  • The other thing is that the document-insert only works if I copy the non-working XML from the query console and paste it into the document-insert statement. If I just do `let $f := fn:doc("/person/10463871.xml") return xdmp:document-insert("text.xml", $f)`, the same issue shows up in the new document. – TJ Tang May 05 '16 at 20:30

1 Answer

The Search API handles the case where the match is the entire content of an embedded inline element, as in:

<p>Before the <b>match</b> and after</p>

In this case, the Search API will use the text before and after the inline element as the snippet (instead of providing no snippet for the match).

To handle this case correctly, the Search API must distinguish it from the case where the match is on the entire content of a leaf element within a structure.

The upi element above is an example of the leaf element case: the match is its entire content, and there is no surrounding text to use as context.

The Search API may have had a bug prior to 8.0-5 that confused the leaf element case for the embedded inline case.
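
One way to check whether a particular server version shows this behavior is a small reproduction in Query Console, ideally against a scratch database. The sketch below is only an illustration: the document URI /test/snippet-check.xml and the trimmed-down person record are made up for the test, and it assumes default search options. The two statements are separated by a semicolon so that the search runs in its own transaction and can see the inserted document.

```xquery
xquery version "1.0-ml";
(: Statement 1: insert a minimal test record. The URI is hypothetical. :)
xdmp:document-insert("/test/snippet-check.xml",
  <person>
    <upi>10025613</upi>
    <displayName>Tj Tang</displayName>
  </person>);

xquery version "1.0-ml";
(: Statement 2: separate transaction, so the new document is visible. :)
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Return only the path attribute of each snippet match. :)
search:search("10025613")//search:match/fn:string(@path)
```

If the leaf element case is handled as described, the returned path would be expected to end in /person/upi rather than stopping at /person. Since the comments report different results depending on how the document was loaded, it may be worth running the search against a document loaded the same way as the production data.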

Hoping that clarifies,

ehennum