9

My Elasticsearch (v5.4.1) documents have a _patents field like this:

{
    // (Other fields : title, text, date, etc.)
    ,
    "_patents": [
        {"cc": "US"},
        {"cc": "MX"},
        {"cc": "KR"},
        {"cc": "JP"},
        {"cc": "CN"},
        {"cc": "CA"},
        {"cc": "AU"},
        {"cc": "AR"}
    ]
}

I'm trying to build a query that returns only the documents whose patents match every country code in an array. For instance, if my filter is ["US","AU"], I need all documents that have patents in both US and AU. Documents that have US but not AU must be excluded.

So far I have tried adding a "terms" clause to my current working query:

{
    "query": {
        "bool": {
            "must": [
                // (Other conditions here : title match, text match, date range, etc.) These work
                 ,
                {
                    "terms": {
                        "_patents.cc": [ // I tried just "_patents"
                            "US",
                            "AU"
                        ]
                    }
                }
            ]
        }
    }
}

Or this, as a filter:

{
    "query": {
        "bool": {
            "must": [...],
            "filter": {
                "terms": {
                    "_patents": [
                        "US",
                        "AU"
                    ]
                }
            }
        }
    }
}

These queries, and the variants I've tried, don't produce an error but return 0 results.

I can't change my ES document model to something easier to match, like "_patents": [ "US","CA", "AU", "CN", "JP" ], because this is a populated field. At index time, I populate and reference Patent documents that have many fields, including cc.

Jeremy Thille

3 Answers

11

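I found the solution. The country codes in the query have to be lowercase.

"US" returns no results, but "us" works, even though the source document contains "US". Most likely the field is mapped as analyzed text: the analyzer lowercases the value at index time, while a term query looks up the exact string it is given without analyzing it.

I also wrote the query this way: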

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "_patents.cc": "us"
          }
        },
        {
          "term": {
            "_patents.cc": "ca"
          }
        }
      ]
    }
  }
}  
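
To make the lowercase behaviour concrete, here is a toy Python model of an analyzed field. It is not Elasticsearch code: `analyze`, `index_documents`, `term_query` and `match_query` are hypothetical helpers, and the analyzer is reduced to "split on whitespace and lowercase", which is only an approximation of the default analyzer.

```python
from collections import defaultdict


def analyze(text):
    """Rough stand-in for a lowercasing analyzer (an assumption, not the real one)."""
    return [token.lower() for token in text.split()]


def index_documents(docs):
    """Map each analyzed token to the set of document ids that contain it."""
    inverted = defaultdict(set)
    for doc_id, country_codes in docs.items():
        for code in country_codes:
            for token in analyze(code):
                inverted[token].add(doc_id)
    return inverted


def term_query(inverted, value):
    """Exact lookup: the value is NOT analyzed, as with a term query."""
    return inverted.get(value, set())


def match_query(inverted, text):
    """The text is analyzed before the lookup, as with a match query."""
    hits = set()
    for token in analyze(text):
        hits |= inverted.get(token, set())
    return hits


docs = {1: ["US", "AU"], 2: ["US", "CA"]}
inverted = index_documents(docs)

print(sorted(term_query(inverted, "US")))   # [] : no token "US" was indexed
print(sorted(term_query(inverted, "us")))   # [1, 2]
print(sorted(match_query(inverted, "US")))  # [1, 2]
```

In this model the index only ever contains "us", "au" and "ca", so an unanalyzed lookup for "US" finds nothing, which matches the symptom described above.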
Jeremy Thille
  • I couldn't figure out why querying against an array of ints was working fine but with an array of strings it returned 0 results. This seems to be true when using "term/terms" but not when using "query." I guess it makes sense to facilitate exact matches but why not transform the query then? I'm missing something, obviously. – regularmike Oct 23 '19 at 19:20
8

This works with both uppercase and lowercase values, since a match query runs the search text through the field's analyzer before looking it up:

 {
  "query": {
    "bool": {
      "must": [ 
        {
          "match": {
            "_patents.cc": "au"
          }
        },
        {
          "match": {
            "_patents.cc": "us"
          }
        }
      ]
    }
  }
}
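
If the list of country codes is dynamic, the repeated match clauses can be generated rather than written by hand. A minimal sketch in Python, assuming the query body is sent as a plain dict; `build_all_countries_query` is a hypothetical helper name, and the default field name is taken from the question.

```python
import json


def build_all_countries_query(country_codes, field="_patents.cc"):
    """Build a bool query with one match clause per code, so every code is required."""
    must_clauses = [{"match": {field: code}} for code in country_codes]
    return {"query": {"bool": {"must": must_clauses}}}


query = build_all_countries_query(["US", "AU"])
print(json.dumps(query, indent=2))
```

A single match clause with `"query": "us au"` and `"operator": "and"` may also work, assuming `_patents` is a regular object field and not mapped as nested; that variant is untested here.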
Rishi Pandey
  • Cool, that's right, thanks :) I didn't know that "term" worked only with lowercase. – Jeremy Thille Jun 16 '17 at 09:54
  • This worked for me, thanks. Do you know if there's a "cleaner" way of doing it where we don't have to repeat the "match" clause ? – tomfl Sep 02 '20 at 20:37
5

I'm on Elasticsearch 6.0.1 and use this approach:

GET <your index>/_search
{
  "query": {
    "bool": {
      "must": [{
        "query_string": {
          "query": "cc:us OR cc:ca"
        }
      }]
    }    
  }
}
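
Note that OR matches documents that contain either code, whereas the question asks for documents that contain all of them, which would call for AND. A small Python sketch that builds the query string for either case; `build_query_string` is a hypothetical helper, and the `_patents.cc` field name is assumed from the question rather than the `cc` used above.

```python
def build_query_string(country_codes, field="_patents.cc", operator="AND"):
    """Join one field:value clause per code with the given boolean operator."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    clauses = [f"{field}:{code}" for code in country_codes]
    return {"query": {"query_string": {"query": f" {operator} ".join(clauses)}}}


body = build_query_string(["us", "au"])
print(body["query"]["query_string"]["query"])
# _patents.cc:us AND _patents.cc:au
```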
1nstinct