Until now we have used a FLWOR query to satisfy our search requirement. Because our data grows every day, we have decided to use indexes for better search performance.

Working FLWOR query (just a sample):

for $doc in collection("col1")
where fn:contains($doc//entityName/text(), "USA")
return document-uri($doc)

The query above works and returns the document URI. Now we are trying to use the Optic API to satisfy the same requirement.

We have created an element range index for entityName, but we are not sure how to convert the FLWOR query above into an Optic query.

What is the equivalent Optic query for the FLWOR query above? In the future we also plan to use the fn:starts-with() and fn:ends-with() functions.

We are using MarkLogic 10.0-2.1

Any help is appreciated

DevNinja

1 Answer

After creating a TDE to project the entity properties, the equivalent Optic query would resemble the following in XQuery:

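import module namespace op = "http://marklogic.com/optic" at "/MarkLogic/optic.xqy";
import module namespace ofn = "http://marklogic.com/optic/expression/fn" at "/MarkLogic/optic/optic-fn.xqy";

(: VIEW_NAME and COLLECTION_NAME are placeholders for string values :)
op:from-view((), VIEW_NAME, '', op:fragment-id-col('docId'))
=> op:where(ofn:contains(op:col('entityName'), 'USA'))
=> op:where(cts:collection-query(COLLECTION_NAME))
=> op:join-doc-uri('uri', op:fragment-id-col('docId'))
=> op:select('uri')
=> op:result()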

In XQuery, the ofn library must be imported.

In SJS, the op.fn field provides the equivalent functions:

const op = require('/MarkLogic/optic');

// VIEW_NAME and COLLECTION_NAME are placeholders for string values
op.fromView(null, VIEW_NAME, '', op.fragmentIdCol('docId'))
  .where(op.fn.contains(op.col('entityName'), 'USA'))
  .where(cts.collectionQuery(COLLECTION_NAME))
  .joinDocUri('uri', op.fragmentIdCol('docId'))
  .select('uri')
  .result()

The operations used:

  1. fromView() accesses the entity view
  2. The first where() filters on the value of the column during query execution
  3. The second where() constrains the entity rows to matching source documents
  4. The joinDocUri() joins the URI lexicon based on the source documents of the entity rows
  5. The select() projects the 'uri' column, ignoring the unneeded view columns.
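
The view that fromView() reads has to be defined by a TDE template first. As a rough sketch only, the function below builds a template object with a single entityName column. The function name, its four parameters, and the example values in the trailing comment (schema 'main', view 'entities', context '/entity', template URI) are all hypothetical; the right context path depends on where entityName sits in the source documents, which the question does not show.

```javascript
// Sketch only: builds a TDE template object that exposes entityName as a view column.
// schemaName, viewName, contextPath and collectionName are hypothetical parameters;
// the real values depend on how the source documents are structured.
function buildEntityTemplate(schemaName, viewName, contextPath, collectionName) {
  return {
    template: {
      context: contextPath,            // each matching element yields one row
      collections: [collectionName],   // apply the template to this collection only
      rows: [{
        schemaName: schemaName,
        viewName: viewName,
        columns: [
          // val is an XPath evaluated relative to the context element
          { name: 'entityName', scalarType: 'string', val: 'entityName' }
        ]
      }]
    }
  };
}

// Inside MarkLogic the template could then be installed with, for example:
// const tde = require('/MarkLogic/tde.xqy');
// tde.templateInsert('/templates/entity.json',
//   buildEntityTemplate('main', 'entities', '/entity', 'col1'));
```

Keeping the template to one column matches the case where only entityName needs to be searchable.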

joinDocUri() is a convenience for

.joinInner(
    op.fromLexicons({'uri':cts.uriReference()}, '', op.fragmentIdCol('uriDocId')),
    op.on(op.fragmentIdCol('docId'), op.fragmentIdCol('uriDocId'))
    )

The Optic expression functions also include op.fn.startsWith() and op.fn.endsWith(). In general, Optic expressions can use a function if it both

  • is a builtin - in other words, doesn't require an import or require
  • only transforms its input to its output - in other words, is purely functional without side effects or environment sensitivity
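
To illustrate switching between these functions, here is a minimal sketch that reuses the plan from this answer and only varies the string-matching function. The helper name buildUriPlan and its parameters are hypothetical; op and cts are passed in as arguments (in MarkLogic SJS, op comes from require('/MarkLogic/optic') and cts is a global), and the view and collection names in the usage comment are assumptions.

```javascript
// Sketch only: builds the URI-returning plan with a chosen string-matching function.
// matcher must be one of the op.fn functions mentioned in this answer.
function buildUriPlan(op, cts, matcher, viewName, collectionName, value) {
  const allowed = ['contains', 'startsWith', 'endsWith'];
  if (!allowed.includes(matcher)) {
    throw new Error('unsupported matcher: ' + matcher);
  }
  return op.fromView(null, viewName, '', op.fragmentIdCol('docId'))
    // filter rows on the entityName column value
    .where(op.fn[matcher](op.col('entityName'), value))
    // constrain rows to documents in the collection
    .where(cts.collectionQuery(collectionName))
    // join the URI of each row's source document
    .joinDocUri('uri', op.fragmentIdCol('docId'))
    .select('uri');
}

// Usage inside MarkLogic (hypothetical view and collection names):
// const op = require('/MarkLogic/optic');
// buildUriPlan(op, cts, 'startsWith', 'entities', 'col1', 'USA').result();
```

The helper returns the plan without calling result(), so the caller decides when to execute it.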

See also this list of expression functions:

https://docs.marklogic.com/guide/app-dev/OpticAPI#id_69308

Hoping that helps,

ehennum
  • Thank you @ehennum. Actually, we are looking for a solution using op:from-lexicons. Could you please help us by modifying the answer above? – DevNinja Jan 27 '21 at 17:46
  • You can always use `op:join-doc()` - which takes the same arguments as `op:join-doc-uri()` - and then use an `op:xpath()` to extract the entity name. But the strong recommendation is to project the entity name with a TDE. Accessing the value in an index is much faster at scale than reading every document in a collection. – ehennum Jan 27 '21 at 17:57
  • Thank you @ehennum, we have created an element range index for the element "entityName" – DevNinja Jan 27 '21 at 18:01
  • Good that you have a working solution. Adding a footnote to consider going forward -- because range indexes are memory mapped, free memory eventually limits how many range indexes can be created. Because TDE indexes are maintained on disk and cached in memory, any number of columns can be indexed. A view can have a single column if that's all that's needed. – ehennum Jan 28 '21 at 16:58