73

I am trying to find ElasticSearch query equivalent to IN \ NOT in SQL.

I know we can use QueryString query with multiple OR to get the same answer, but that ends up with lot of OR's.

Can anyone share the example?

pickypg
  • 22,034
  • 5
  • 72
  • 84
Sameer Deshmukh
  • 1,339
  • 4
  • 13
  • 20

3 Answers3

110

Similar to what Chris suggested as a comment, the analogous replacement for IN is the terms filter (queries imply scoring, which may improve the returned order).

SELECT * FROM table WHERE id IN (1, 2, 3);

The equivalent Elasticsearch 1.x filter would be:

{
  "query" : {
    "filtered" : {
      "filter" : {
        "terms" : {
          "id" : [1, 2, 3]
        }
      }
    }
  }
}

The equivalent Elasticsearch 2.x+ filter would be:

{
  "query" : {
    "bool" : {
      "filter" : {
        "terms" : {
          "id" : [1, 2, 3]
        }
      }
    }
  }
}

The important takeaway is that the terms filter (and query for that matter) work on exact matches. It is implicitly an or operation, similar to IN.

If you wanted to invert it, you could use the not filter, but I would suggest using the slightly more verbose bool/must_not filter (to get in the habit of also using bool/must and bool).

{
  "query" : {
    "bool" : {
      "must_not" : {
        "terms" : {
          "id" : [1, 2, 3]
        }
      }
    }
  }
}

Overall, the bool compound query syntax is one of the most important filters in Elasticsearch, as are the term (singular) and terms filters (plural, as shown).

pickypg
  • 22,034
  • 5
  • 72
  • 84
  • 1
    what will be the time complexity for above query if array size is 800 or more – dj Bravo Jul 25 '19 at 08:03
  • what is the limitations here. any maximum array size? – Oshan Wisumperuma Dec 06 '19 at 12:38
  • Is it possible to provide another query instead of string array? – Hardik Patel Sep 01 '22 at 13:53
  • Ultimately, the complexity of providing an array of size N means iterating over N terms and including / excluding the docs with those terms in the inverted index. In terms of the other query, you probably are looking for `bool`/`should` (and `"minimum_should_match": 1`) to get `OR` behavior of separate queries. – pickypg Apr 20 '23 at 18:08
5

1 terms

you can use terms term query in ElasticSearch that will act as IN

terms query is used to check if the value matches any of the provided values from Array.

2 must_not

must_not can be used as NOT in ElasticSearch.

ex.

GET my_index/my_type/_search
{
    "query" : {
         "bool" : {
              "must":[
                {
                   "terms": {
                        "id" : ["1234","12345","123456"]
                    }
                },
                {
                   "bool" : {
                        "must_not" : [
                            {
                              "match":{
                                  "id" : "123"
                               }
                            }
                        ]
                    }
                }
              ]
         }
    }
}
  1. exists

Also if it helps you can also use "exists" query to check if the field exists or not. for ex, check if the field exists

"exists" : {
      "field" : "mobileNumber"
   }

check if a field does not exist

"bool":{
    "must_not" : [
        {
           "exists" : {
               "field" : "mobileNumber"
           }
        }
     ]
}
niranjan_harpale
  • 2,048
  • 1
  • 17
  • 21
-2

I saw what you requested. And I wrote the source code as below.

I hope this helps you solve your problem.

sql query :

select * from tablename where fieldname in ('AA','BB');

elastic search :

{
    query :{
        bool:{
            must:[{
              "script": {
                "script":{
                  "inline": "(doc['fieldname'].value.toString().substring(0,2).toUpperCase() in ['AA','BB']) == true"
                }
              }
            }],
            should:[],
            must_not:[]
        }
    }
}
sunnier
  • 9
  • 1
  • First: there are easier ways to do a "startsWith" style string in ES and Java/Painless for that matter. Second: this is not equivalent to the SQL query. My original answer is still correct here. – pickypg Nov 06 '18 at 20:57