Is there a way to filter Elasticsearch documents based on the length of a specific field?

For instance, I have a set of documents with a "body" field, and I only want to return results where the number of characters in body is greater than 1000. Is there a way to do this in Elasticsearch without adding an extra field to the index that stores the length?

Henley

2 Answers

Use the script filter, like this:

"filtered" : {
    "query" : {
        ...
    }, 
    "filter" : {
        "script" : {
            "script" : "doc['body'].length > 1000"
        }
    }
}

EDIT: Sorry, I meant to reference the query DSL guide on script filters.
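For readers on newer releases: the `filtered` query was removed in Elasticsearch 5.0, where the same idea is usually written as a `bool` query with a `script` clause in its `filter`. The Python sketch below only builds such a request body; it does not talk to a cluster. It assumes Painless scripting and a hypothetical `body.keyword` sub-field with doc values, mapped without a small `ignore_above` limit (the dynamic-mapping default of 256 would skip long values). The helper name `length_filter_query` is made up for illustration.

```python
import json


def length_filter_query(field, min_length):
    """Build a search body that keeps documents whose `field` value
    has more than `min_length` characters.

    Assumption: `field` is a keyword-style field with doc values,
    since scripts cannot read an analyzed text field by default.
    """
    # The field name is placed in the script source; the threshold is
    # passed as a parameter so the compiled script can be reused.
    source = (
        f"doc['{field}'].size() > 0 && "
        f"doc['{field}'].value.length() > params.min_length"
    )
    return {
        "query": {
            "bool": {
                "filter": {
                    "script": {
                        "script": {
                            "lang": "painless",
                            "source": source,
                            "params": {"min_length": min_length},
                        }
                    }
                }
            }
        }
    }


# "body.keyword" is a hypothetical sub-field name, not from the question.
body = length_filter_query("body.keyword", 1000)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request. Script filters run per document, so this is likely to be slower than filtering on a length stored at index time.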

Phil
  • The [documentation for Elasticsearch 2.1](https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting.html) doesn't mention a `.length` field. Does this still work? – robinst Jan 12 '16 at 04:38
  • Presumably, if you explicitly enable scripting support this would still work (scripting was disabled by default in v1.4 I believe). Groovy scripting is now used rather than MVEL, so you'll probably want to check into this. – Phil Jan 12 '16 at 16:40
  • http://stackoverflow.com/questions/23023233/elasticsearch-statistical-facet-on-length-of-string-field mentions you can use `"script" : "doc['body'].value.length()"` which worked for me on 1.7.5 – nezda Sep 07 '16 at 16:08
  • The link is bad, I think. – random-forest-cat Apr 23 '21 at 20:03

You can also define a custom tokenizer that emits one token per character and use it in a `token_count` sub-field of a multi-field, so the length is stored at index time, as in the following:

PUT test_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "character_analyzer": {
          "type": "custom",
          "tokenizer": "character_tokenizer"
        }
      },
      "tokenizer": {
        "character_tokenizer": {
          "type": "nGram",
          "min_gram": 1,
          "max_gram": 1
        }
      }
    }
  }, 
  "mappings": {
    "person": {
      "properties": {
        "name": { 
          "type": "text",
          "fields": {
            "keyword": { 
              "type": "keyword"
            },
            "words_count": { 
              "type": "token_count",
              "analyzer": "standard"
            },
            "length": { 
              "type": "token_count",
              "analyzer": "character_analyzer"
            }
          }
        }
      }
    }
  }
}

PUT test_index/person/1
{
  "name": "John Smith"
}

PUT test_index/person/2
{
  "name": "Rachel Alice Williams"
}

GET test_index/person/_search
{
  "query": {
    "term": {
      "name.length": 10
    }
  }
}
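
The search above matches an exact length of 10. For the original question (length greater than some threshold), a `range` query on the same `name.length` sub-field would be the natural variant. The Python sketch below is only an illustration: `char_count` approximates what the 1-gram `token_count` stores, assuming the tokenizer keeps every character (spaces included), and `min_length_query` is a hypothetical helper that builds the request body.

```python
import json


def char_count(text):
    """Approximate the value indexed in name.length: one 1-character
    gram per character, assuming no characters are dropped."""
    return len(text)


def min_length_query(count_field, min_length):
    """Build a search body matching documents whose stored count
    in `count_field` is strictly greater than `min_length`."""
    return {"query": {"range": {count_field: {"gt": min_length}}}}


# The two sample documents indexed above.
docs = {1: "John Smith", 2: "Rachel Alice Williams"}
for doc_id, name in docs.items():
    print(doc_id, char_count(name))  # prints "1 10" then "2 21"

# Only document 2 (21 characters) would satisfy this range.
print(json.dumps(min_length_query("name.length", 10)))
```

Because the count is computed at index time, this filter avoids per-document scripting at search time, at the cost of an extra sub-field in the mapping.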
Mousa