
I recently had a MarkLogic interview with a well-known company. The interviewer asked me a question that I couldn't answer, based on the XML sample data shown below.

He asked how to get only the employee ID of the employees whose zip code is 12345 and whose state is California using search, such as cts:search.

The first thing that came to mind was an XPath like the one below, but since he asked for a search-based solution, I couldn't answer.

let $x := //employee/officeAddress[zipCode="12345"]/../employeeId/string()
return $x

XML dataset:

<employees>
  <employee>
    <employeeId>30004</employeeId>
    <firstName>crazy</firstName>
    <lastName>carol</lastName>
    <designation>Director</designation>
    <homeAddress>
      <address>900 clean ln</address>
      <street>quarky st</street>
      <city>San Jose</city>
      <state>California</state>
      <zipCode>22222</zipCode>
    </homeAddress>
    <officeAddress>
      <address>000 washington ave</address>
      <street>bonaza st</street>
      <city>San Francisco</city>
      <state>California</state>
      <zipCode>12345</zipCode>
    </officeAddress>
  </employee>
</employees>
Mads Hansen
user2708013

2 Answers


Reaching for XPath is a natural first instinct for anyone who knows XML technologies and is new to MarkLogic. It is what I did when I was starting out.

The database can optimize some XPath expressions so that they run efficiently, but others cannot be optimized and may perform poorly.

Using cts:search and the built-in query constructs allows for optimized expressions that will leverage indexes, and allows you to further tune by analyzing xdmp:plan, xdmp:query-meters, and xdmp:query-trace.

An equivalent cts:search expression for the XPath, specifying the path /employees/employee as the first ($expression) parameter and combining cts:element-value-query with cts:and-query in the second ($query) parameter, would be:

cts:search(/employees/employee, 
  cts:and-query(( 
    cts:element-value-query(xs:QName("zipCode"), "12345"), 
    cts:element-value-query(xs:QName("state"), "California") )))/employeeId
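One thing worth noting: the query above matches an employee when the two values appear anywhere inside it, so a zip code in homeAddress and a state in officeAddress could satisfy it together. A minimal sketch of one way to tie both values to the same officeAddress element, assuming that is what the interviewer intended (the original XPath only constrained officeAddress), is to wrap the criteria in cts:element-query:

```xquery
xquery version "1.0-ml";

(: Sketch, not tested against a real database:
   both value constraints must be satisfied inside one officeAddress element. :)
cts:search(/employees/employee,
  cts:element-query(xs:QName("officeAddress"),
    cts:and-query((
      cts:element-value-query(xs:QName("zipCode"), "12345"),
      cts:element-value-query(xs:QName("state"), "California")
    ))
  )
)/employeeId/string()
```

Because the search text "California" contains an uppercase letter, cts:element-value-query treats it as case-sensitive by default; pass the "case-insensitive" option if the stored values may be lowercase.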

You could also use a more generic $expression to search against all documents, wrap the cts:element-value-query criteria in a cts:element-query to restrict the search to descendants of the employee element, and then XPath into the resulting document(s):

cts:search(doc(), 
  cts:element-query(xs:QName("employee"), 
    cts:and-query(( 
      cts:element-value-query(xs:QName("zipCode"), "12345"), 
      cts:element-value-query(xs:QName("state"), "California") ))
  )
)/employees/employee/employeeId
Mads Hansen

The XPath I would have tried (not tested):

/employees/employee[officeAddress/zipCode = '12345' and officeAddress/state = 'California']/employeeId/string()

Note that you can use xdmp:plan on XPath too; it's interesting to compare how it is resolved versus cts:search.

In general you're better off putting as much as possible into cts:search rather than XPath (and I like XPath!).

The question is a little ambiguous. Are there many employees in one document? Or many employee documents? Both?

Also, don't forget to enable the appropriate position indexes, or the indexes alone (the unfiltered stage) won't narrow the results much. Look at the plan before and after adding the indexes.
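As a rough sketch of that comparison, using the zip code 12345 and state from the question, both forms can be passed to xdmp:plan in one module. The exact plan output depends on the database's index settings, so none is shown here:

```xquery
xquery version "1.0-ml";

(: Sketch: return the plan for each form of the query.
   xdmp:plan reports how the expression would be resolved from the indexes;
   it does not return the matching employees. :)
(
  xdmp:plan(
    /employees/employee[officeAddress/zipCode = "12345"
                        and officeAddress/state = "California"]/employeeId
  ),
  xdmp:plan(
    cts:search(/employees/employee,
      cts:and-query((
        cts:element-value-query(xs:QName("zipCode"), "12345"),
        cts:element-value-query(xs:QName("state"), "California")
      )))
  )
)
```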
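One way to see how much the indexes resolve on their own is to compare unfiltered and filtered result counts. This is only a sketch, assuming the officeAddress scoping is what is wanted; the counts depend entirely on your data and index settings:

```xquery
xquery version "1.0-ml";

(: Sketch: the same query run unfiltered (index resolution only)
   and filtered (index resolution plus verification of each candidate). :)
let $query :=
  cts:element-query(xs:QName("officeAddress"),
    cts:and-query((
      cts:element-value-query(xs:QName("zipCode"), "12345"),
      cts:element-value-query(xs:QName("state"), "California")
    ))
  )
return (
  fn:count(cts:search(fn:doc(), $query, "unfiltered")),
  fn:count(cts:search(fn:doc(), $query, "filtered"))
)
```

If the first count is higher than the second, the indexes are returning false positives that filtering has to discard, which is the situation the position indexes are meant to reduce.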

See also https://help.marklogic.com/Knowledgebase/Article/View/queries-constrained-to-elements

asusu