I ran into a weird issue with the Elasticsearch Python clients and would like someone to explain it to me. The query below works directly from cURL but doesn't work with python-requests or elasticsearch-py. Strangely, it works when I switch to the pyelasticsearch library. The details are:
I have a type called MY_TYPE that has a nested object MY_NESTED_FIELD and a child document type MY_CHILD_TYPE. I'm trying to run a terms (facet) aggregation on the nested attributes, based on filters applied to the MY_TYPE and MY_CHILD_TYPE types. The query looks like this:
{
  "query": {
    "filtered": {
      "filter": {
        "has_child": {
          "query": {
            "range": {
              "CHILD_FIELD": {
                "gte": 0.5
              }
            }
          },
          "type": "MY_CHILD_TYPE"
        }
      }
    }
  },
  "aggs": {
    "aggregation_results": {
      "aggs": {
        "boards": {
          "terms": {
            "field": "MY_NESTED_FIELD.KEY",
            "size": 100
          },
          "aggs": {
            "MY_RANGES": {
              "range": {
                "ranges": [
                  {
                    "to": 0.5,
                    "from": 0
                  },
                  {
                    "to": 0.8,
                    "from": 0.5
                  }
                ],
                "field": "MY_NESTED_FIELD_PATH.VALUE"
              }
            }
          }
        }
      },
      "nested": {
        "path": "MY_NESTED_FIELD_PATH"
      }
    }
  }
}
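The Python snippets further down pass this query in a variable named payload that is never shown. A minimal sketch of how it could be built as a plain dict, assuming the placeholder names from the query above; building it in Python and serializing with json.dumps also rules out hand-typed JSON syntax slips such as a missing comma:

```python
import json

# Filter part: parents that have at least one child with CHILD_FIELD >= 0.5.
query = {
    "filtered": {
        "filter": {
            "has_child": {
                "type": "MY_CHILD_TYPE",
                "query": {"range": {"CHILD_FIELD": {"gte": 0.5}}},
            }
        }
    }
}

# Aggregation part: terms buckets on the nested key, each split into ranges.
aggs = {
    "aggregation_results": {
        "nested": {"path": "MY_NESTED_FIELD_PATH"},
        "aggs": {
            "boards": {
                "terms": {"field": "MY_NESTED_FIELD.KEY", "size": 100},
                "aggs": {
                    "MY_RANGES": {
                        "range": {
                            "field": "MY_NESTED_FIELD_PATH.VALUE",
                            "ranges": [
                                {"from": 0, "to": 0.5},
                                {"from": 0.5, "to": 0.8},
                            ],
                        }
                    }
                },
            }
        },
    }
}

payload = {"query": query, "aggs": aggs}

# Serialized form, as it would be sent in the request body.
body = json.dumps(payload)
```

The field and type names are the anonymized placeholders from the question, not real mapping names.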
When I run this query against Elasticsearch directly (using cURL or the head plugin), it filters the parents and returns aggregations computed over the correct results. However, when I run it from a Python script, it completes successfully but returns wrong data: the facets are computed over all documents, as if the filter had not been applied.
I have tried:
- cURL: Works!
- Elasticsearch's head plugin: Works!
- python-requests version 2.8.1: Did not work!
- elasticsearch-py api versions 1.4.0 and 2.1.0: Did not work!
- pyelasticsearch version 1.4: Works!
The code snippet for elasticsearch-py is:
from elasticsearch import Elasticsearch
es = Elasticsearch('HOST:PORT')
# payload is the query shown above, as a Python dict
data = es.search(index='INDEX_NAME', doc_type='MY_TYPE', body=payload, q='*:*', size=0)
When using python-requests, the code was:
import json
import requests
url = 'http://ES_HOST:ES_PORT/ES_INDEX/ES_TYPE/_search'
params = {'size': 0, 'q': '*:*'}
# payload is the query shown above, as a Python dict
data = requests.post(url, params=params, data=json.dumps(payload)).json()
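Both Python calls send a q='*:*' URL parameter alongside the JSON body. Assuming the working cURL and head-plugin requests send only the body (the question does not show them), that parameter is one visible difference worth ruling out. The sketch below is not a confirmed diagnosis; it only builds two otherwise identical requests, one with q and one without, so their responses can be compared. build_search_request is a hypothetical helper, and the match_all payload is a stand-in for the real query:

```python
import json


def build_search_request(base_url, payload, size=0, q=None):
    """Return keyword arguments for requests.post (hypothetical helper)."""
    params = {"size": size}
    if q is not None:
        # Only add the URI query string when explicitly requested.
        params["q"] = q
    return {"url": base_url, "params": params, "data": json.dumps(payload)}


url = "http://ES_HOST:ES_PORT/ES_INDEX/ES_TYPE/_search"
payload = {"query": {"match_all": {}}}  # stand-in; use the real query dict here

with_q = build_search_request(url, payload, q="*:*")
without_q = build_search_request(url, payload)

# To run the comparison against a live cluster (not executed here):
#   import requests
#   a = requests.post(**with_q).json()
#   b = requests.post(**without_q).json()
#   then compare a["aggregations"] with b["aggregations"]
```

If the two responses differ, the q parameter is interacting with the body query; if they match, the cause lies elsewhere.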
My Elasticsearch version is:
{
"version": {
"number": "1.4.4",
"build_hash": "c88f77ffc81301dfa9dfd81ca2232f09588bd512",
"build_timestamp": "2015-02-19T13:05:36Z",
"build_snapshot": false,
"lucene_version": "4.10.3"
}
}
So my questions are:
- Is this the best way to write this query?
- Is there an explanation for why elasticsearch-py is acting strangely?
- Is there a fix for this in elasticsearch-py?