
I have the following structure of objects in ElasticSearch:

{
  _id: 1,
  myObj: {
    myCol: [{id: 1, name:"1"}, {id: 2, name:"2"}, {id: 3, name:"3"}]
  }
},
{
  _id: 2,
  myObj: {
    myCol: [{id: 2, name:"2"}, {id: 3, name:"3"}, {id: 4, name:"4"}]
  }
},

I'm using the C# NEST library to build queries. I want to search the myCol collection of objects using a collection of identifiers.

Example #1: Search request: identifiers [2, 3]. Result: both objects are returned.

Example #2: Search request: identifiers [1]. Result: the first object is returned.

Example #3: Search request: identifiers [1, 2, 3, 4]. Result: no objects are returned.


What I'm actually trying to do is a "contains all" query.

Please note:

  1. The C# NEST MultiMatchQuery type does not support integer arrays (only strings), so please don't suggest this type of query.
  2. I'm using the Object Initializer query syntax.
  3. A correct query in ElasticSearch syntax would be enough.
Ilya Schukin
  • If I understand you right, you shouldn't need anything special here - just a bool query with multiple `Must` clauses on `myObj.myCol.id` - collections in lucene are flattened out so what you actually have is effectively duplicate keys such that `myObj.myCol.id = 1 && myObj.myCol.id = 2 ...` – Ant P Sep 29 '16 at 09:48
  • Can you create an example of this query in Elastic please? – Ilya Schukin Sep 29 '16 at 09:50

1 Answer


What you want is to fetch documents that contain all of the specified IDs somewhere in the collection.

When you use collections of objects in ElasticSearch, they are flattened out so what you actually index is something like the following.

myObj.myCol.id = [ 2, 3, 4 ]
myObj.myCol.name = [ "2", "3", "4" ]

In many cases this is problematic because you lose track of which pairs of ID/Name go together (so you can't, for example, query for a document containing an object with ID x and name y - it will produce false positives if the collection contains x and y in different objects).
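
The flattening and the resulting false positive can be imitated in a few lines of Python. This is only a rough model of what gets indexed, not how Elasticsearch is implemented; the `flatten` helper is a hypothetical name made up for the illustration.

```python
from collections import defaultdict

def flatten(doc):
    """Collect every leaf value under its dotted path (hypothetical helper)."""
    fields = defaultdict(list)

    def walk(node, path):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, f"{path}.{key}" if path else key)
        elif isinstance(node, list):
            # Array elements share the same path, so object boundaries are lost.
            for item in node:
                walk(item, path)
        else:
            fields[path].append(node)

    walk(doc, "")
    return dict(fields)

doc = {"myObj": {"myCol": [{"id": 2, "name": "2"},
                           {"id": 3, "name": "3"},
                           {"id": 4, "name": "4"}]}}
flat = flatten(doc)
print(flat)
# {'myObj.myCol.id': [2, 3, 4], 'myObj.myCol.name': ['2', '3', '4']}

# No single object has id 2 AND name "3", yet the flattened view matches both.
false_positive = 2 in flat["myObj.myCol.id"] and "3" in flat["myObj.myCol.name"]
print(false_positive)  # True
```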

However, in your case, it's actually beneficial, because you can just query for documents that contain all of the IDs in myObj.myCol.id, e.g.:

{
  "query": {
    "bool": {
      "must": [
        { "match": { "myObj.myCol.id": 1 }},
        { "match": { "myObj.myCol.id": 2 }}
      ]
    }
  }
}

This will only return documents where myObj.myCol contains objects with IDs of both 1 and 2.
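
For an arbitrary list of identifiers, the same request body can be generated with one `must` clause per ID. The sketch below builds it as a plain Python dict that serializes to the JSON above; `contains_all_query` is a hypothetical helper name, and the field path is taken from the question.

```python
import json

def contains_all_query(field, ids):
    """Build a bool query whose must clauses require every ID (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                # One match clause per ID; all of them have to match.
                "must": [{"match": {field: value}} for value in ids]
            }
        }
    }

body = contains_all_query("myObj.myCol.id", [1, 2])
print(json.dumps(body, indent=2))
```

With the two documents from the question, [2, 3] should match both, [1] only the first, and [1, 2, 3, 4] neither, since no document holds all four IDs.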

More information on how collections work in ES can be found here.

Ant P
  • Unfortunately this doesn't work. The query executes successfully with one identifier specified. But when I add another one - no results are returned – Ilya Schukin Sep 29 '16 at 10:09
  • @IlyaSchukin strange - if I index the documents you posted and use this query, it works as expected. Could some detail you haven't posted be affecting the results of the query? Maybe try stripping it back to what's actually in this Q&A and building back up from there. – Ant P Sep 29 '16 at 10:17
  • You were right. Very close though. I had to create some "nested" queries and put the bool query inside each of them – Ilya Schukin Sep 30 '16 at 07:59
  • @IlyaSchukin - Ah, you have nested objects in your mappings? – Ant P Sep 30 '16 at 08:03
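
If myObj.myCol is mapped as a `nested` type (an assumption; the comments suggest it but the mapping is never shown), each nested object is matched on its own, so a single nested query requiring both IDs can never match. A minimal sketch of one possible shape wraps each ID in its own nested query and combines them in an outer bool; `nested_contains_all_query` is a hypothetical helper name.

```python
import json

def nested_contains_all_query(path, field, ids):
    """One nested clause per ID, all required by the outer bool (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                "must": [
                    # Assumes `path` is mapped as a nested type in the index.
                    {"nested": {"path": path, "query": {"match": {field: value}}}}
                    for value in ids
                ]
            }
        }
    }

nested_body = nested_contains_all_query("myObj.myCol", "myObj.myCol.id", [1, 2])
print(json.dumps(nested_body, indent=2))
```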