
Hi, I would like to index objects that look like this:

{
   uuid: "123",
   clauses: [{ order: 1, uuid: "345"},{ order: 2, uuid: "567"},{ order: 3, uuid: "789"}]

}

Is there a way to write a query that matches all objects containing clauses with uuid: "345" and uuid: "789", where the order of the second clause is at most 2 greater than the order of the first?

So the above example would match, but the next one wouldn't:

 {
   uuid: "999",
   clauses: [{ order: 1, uuid: "345"},{ order: 2, uuid: "567"},{order: 3, uuid: "777"},{ order: 4, uuid: "789"}]

}

The reason is that the order of the "789" clause is 4, which is more than 2 greater than the order of the "345" clause (1).

Any help is appreciated! Thanks, Michail

Michail Michailidis

1 Answer


One way to achieve this involves using a script filter.

The script I'm using is the following:

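// positions of the clauses whose uuid is listed in the "matches" param
def idxs = [];
for (int i = 0; i < doc['clauses.uuid'].values.size(); i++) {
    if (matches.contains(doc['clauses.uuid'].values[i])){
        idxs << i
    }
};
// order values found at those positions
def orders = idxs.collect{ doc['clauses.order'].values[it]};
// keep the document only if the second order is at most 2 above the first
return orders[1] - orders[0] <= 2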

Basically, what I'm doing is first collecting the indices of all clauses whose uuid is in the matches array (i.e. 345 and 789). Then, using those indices, I gather the order values at the same positions. Finally, I check that the second order minus the first order is not bigger than 2.

POST your_index/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "clauses.uuid": "345"
          }
        },
        {
          "term": {
            "clauses.uuid": "789"
          }
        },
        {
          "script": {
            "script": "def idxs = []; for (int i = 0; i < doc['clauses.uuid'].values.size(); i++) {if (matches.contains(doc['clauses.uuid'].values[i])){idxs << i}}; def orders = idxs.collect{doc['clauses.order'].values[it]}; return orders[1] - orders[0] <= 2",
            "params": {
              "matches": [
                "345",
                "789"
              ]
            }
          }
        }
      ]
    }
  }
}

That will return only the first document and not the second.
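
To make the logic easier to check outside Elasticsearch, here is a minimal Python sketch of the same comparison applied to the two example documents from the question. The names `within_gap`, `doc_123` and `doc_999` are made up for illustration, and the length guard stands in for the two term filters, which guarantee that both uuids are present before the script runs.

```python
def within_gap(doc, matches, max_gap=2):
    """Mirror the script filter: positions of matching uuids -> their orders -> compare."""
    clauses = doc["clauses"]
    # positions of the clauses whose uuid is in `matches`
    idxs = [i for i, clause in enumerate(clauses) if clause["uuid"] in matches]
    # order values at those positions
    orders = [clauses[i]["order"] for i in idxs]
    # guard replaces the term filters of the real query (both uuids must exist)
    return len(orders) >= 2 and orders[1] - orders[0] <= max_gap


# the two documents from the question
doc_123 = {
    "uuid": "123",
    "clauses": [
        {"order": 1, "uuid": "345"},
        {"order": 2, "uuid": "567"},
        {"order": 3, "uuid": "789"},
    ],
}
doc_999 = {
    "uuid": "999",
    "clauses": [
        {"order": 1, "uuid": "345"},
        {"order": 2, "uuid": "567"},
        {"order": 3, "uuid": "777"},
        {"order": 4, "uuid": "789"},
    ],
}

matches = ["345", "789"]
hits = [d["uuid"] for d in (doc_123, doc_999) if within_gap(d, matches)]
print(hits)  # ['123']
```

One assumption carried over from the script: the values at position i in clauses.uuid and clauses.order belong to the same clause. As far as I know, doc values of multi-valued fields are returned sorted rather than in source order, so this is worth verifying on data where the uuids are not already in ascending order.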

Val
  • Thanks, I will try it! Do you think this could be very CPU-intensive? I had in mind to actually precalculate the forwardClauses of each clause and then use nested filtering. I wonder which approach would be faster during retrieval. – Michail Michailidis Mar 21 '16 at 17:51
  • Clearly, whatever you can compute at indexing time, do it; it'll be much faster at query time. – Val Mar 21 '16 at 17:55
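
For the precalculation idea from the comments, a rough Python sketch of what could be done before indexing is shown below. The comment only names a forwardClauses field, so its layout here is an assumption: each clause gets the list of uuids of the clauses that follow it within `max_gap` order positions. The helpers `add_forward_clauses` and `matches_pair` are hypothetical names.

```python
def add_forward_clauses(doc, max_gap=2):
    """Return a copy of doc where each clause lists the uuids that follow it within max_gap."""
    clauses = doc["clauses"]
    enriched = []
    for clause in clauses:
        # assumption: "forward" means strictly later, at most max_gap orders ahead
        forward = [
            other["uuid"]
            for other in clauses
            if 0 < other["order"] - clause["order"] <= max_gap
        ]
        enriched.append({**clause, "forwardClauses": forward})
    return {**doc, "clauses": enriched}


def matches_pair(doc, first, second):
    """Local stand-in for a per-clause (nested) query: uuid == first AND second in forwardClauses."""
    return any(
        clause["uuid"] == first and second in clause["forwardClauses"]
        for clause in doc["clauses"]
    )


doc_123 = add_forward_clauses({
    "uuid": "123",
    "clauses": [
        {"order": 1, "uuid": "345"},
        {"order": 2, "uuid": "567"},
        {"order": 3, "uuid": "789"},
    ],
})
doc_999 = add_forward_clauses({
    "uuid": "999",
    "clauses": [
        {"order": 1, "uuid": "345"},
        {"order": 2, "uuid": "567"},
        {"order": 3, "uuid": "777"},
        {"order": 4, "uuid": "789"},
    ],
})

print(matches_pair(doc_123, "345", "789"))  # True
print(matches_pair(doc_999, "345", "789"))  # False
```

With documents enriched this way, the query side would no longer need a script: a nested query requiring uuid "345" and forwardClauses "789" on the same clause should be enough, assuming clauses is mapped as a nested field so that both conditions are evaluated per clause.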