
I am able to query an index in Elasticsearch, and now I want to narrow the returned data down to a few specific fields. However, I keep getting errors.

Here is my query:

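from elasticsearch import Elasticsearch

# "myhost" and 0000 are placeholders for the real host and port
es = Elasticsearch(hosts=[{"host": "myhost", "port": 0000}])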


search_body={
    "bool":{
            "filter":[
                {"exists": {"field": "customer_name"}},
                {"match_phrase": {"city": "chicago"}},
                ]
        }

    }

results = es.search(index="some_index", query=search_body)

I can easily get results up to this point. But since each returned document has so many fields, I want to retrieve only specific fields before converting the results into a dataframe. I could convert everything into a dataframe and filter afterwards, but that is not optimal.


I tried adding "_source" and "fields" options like this:

search_body={
    "bool":{
            "filter":[
                {"exists": {"field": "customer_name"}},
                {"match_phrase": {"city": "chicago"}},
                ]
        },
    "_source":{"fields": {"includes":["customer_name", "city", "company", "company_address"] }}
    }

and other variants like,

"fields": {"includes":["customer_name", "city", "company", "company_address"] }

# or 

"_source":{"includes":["customer_name", "city", "company", "company_address"] }

# and several others.

I keep getting this error:

    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
elasticsearch.exceptions.RequestError: RequestError(400, 'parsing_exception', '[bool] malformed query, expected [END_OBJECT] but found [FIELD_NAME]')

I followed:

What am I missing here?

everestial007

3 Answers


Judging from the JSON you provided, your _source field is landing inside your query. Instead, it should sit next to the query, structured like so:

search_body = {
  "query": {
    "bool": {...}
  },
  "_source": {
    "includes": [...]
  }
}
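
A minimal sketch of that layout, using the filters and field names from the question. The helper name `build_search_body` is hypothetical, and the commented-out call assumes a client version that still accepts the `body` argument. The dict has to be passed as `body`, not as `query`, because `query` expects only the inner query clause.

```python
def build_search_body(filters, fields):
    """Return a full request body: "query" and "_source" are siblings."""
    return {
        "query": {"bool": {"filter": list(filters)}},
        "_source": {"includes": list(fields)},
    }


search_body = build_search_body(
    filters=[
        {"exists": {"field": "customer_name"}},
        {"match_phrase": {"city": "chicago"}},
    ],
    fields=["customer_name", "city", "company", "company_address"],
)

# Assumed usage against a live cluster (not executed here):
# results = es.search(index="some_index", body=search_body)
```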

(disclaimer: I'm the maintainer of the Python Elasticsearch client and work for Elastic)

sethmlarson
  • I did try that, and it does not work. Whenever I use `"query":{"bool": .... }` I get a malformed query error. It might work in a Kibana search, but it does not work with the Python Elasticsearch client. Since "search_body" itself is passed as the query, the extra "query" key should not be needed. I assume so because it has never worked for me. – everestial007 Jan 11 '22 at 18:19

Try this:

results = es.search(index="some_index", query=search_body, source_includes=[...])
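
A slightly fuller sketch of the same idea, with the field list from the question filled in and the hits flattened into rows for a dataframe. `source_includes` is the keyword used above; some older 7.x clients may spell it `_source_includes`, so check it against your installed version. The helper `hits_to_rows` and the sample response are hypothetical and only illustrate the usual response shape.

```python
# Only the query clause goes into "query"; field selection is a separate argument.
query_body = {
    "bool": {
        "filter": [
            {"exists": {"field": "customer_name"}},
            {"match_phrase": {"city": "chicago"}},
        ]
    }
}
wanted_fields = ["customer_name", "city", "company", "company_address"]

search_kwargs = {
    "index": "some_index",
    "query": query_body,
    "source_includes": wanted_fields,
}
# Assumed usage against a live cluster (not executed here):
# results = es.search(**search_kwargs)


def hits_to_rows(response):
    """Collect the _source dict of every hit in a dict-like search response."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


# Made-up response with the standard hits.hits[]._source layout.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "_source": {"customer_name": "Example Customer", "city": "chicago"}},
        ]
    }
}
rows = hits_to_rows(sample_response)
```

The resulting list of dicts can then be handed to pandas, for example with pandas.DataFrame(rows).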

Code is the best documentation (sometimes!)

ilvar

The main issue is whether "search_body" is passed as body or as query.

If my "search_body" is as given below, I cannot pass it as query, because query is meant to hold only the query clause that I run against the index. Requesting _source alongside it makes the request malformed.

search_body = {
    # the query clause sits under "query" ...
    "query": {
        "bool": {
            "filter": [
                {"exists": {"field": "customer_name"}},
                {"match_phrase": {"city": "chicago"}},
            ]
        }
    },
    # ... and the field selection sits next to it
    "_source": {"includes": ["customer_name", "city", "company", "company_address"]},
}

This will pass because the dict is passed as body, which holds the "query" plus a separate "_source" entry to subset the data.

es = Elasticsearch(hosts=[{"host": "myhost", "port": 0000}])

results = es.search(index="some_index", body=search_body)

This will fail because the dict is passed as query while also asking to subset the data.

es = Elasticsearch(hosts=[{"host": "myhost", "port": 0000}])

results = es.search(index="some_index", query=search_body)

This second request will pass if our search_body contains only the query clause:

search_body={
    "bool":{
            "filter":[
                {"exists": {"field": "customer_name"}},
                {"match_phrase": {"city": "chicago"}},
                ]
        }
    }

For clarity, though, the variable is better named "query_body":

query_body={
    "bool":{
            "filter":[
                {"exists": {"field": "customer_name"}},
                {"match_phrase": {"city": "chicago"}},
                ]
        }
    }

and requested as:

es = Elasticsearch(hosts=[{"host": "myhost", "port": 0000}])

results = es.search(index="some_index", query=query_body)

So, query and body are two different ways of requesting data from an index: query takes only the query clause, while body takes the whole request.

Note: The Python Elasticsearch client may soon deprecate the body argument. In that case, subsetting the filtered/queried data will need a different approach.
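
If body does go away, one option is to split an existing body dict into keyword arguments. This is only a sketch: the helper `body_to_kwargs` is hypothetical, and it assumes the installed client accepts `query`, `source_includes` and `source_excludes` as keyword arguments of `search`, which should be verified for your version.

```python
def body_to_kwargs(body):
    """Split a request body dict into keyword arguments for es.search()."""
    kwargs = {}
    if "query" in body:
        kwargs["query"] = body["query"]
    source = body.get("_source")
    if isinstance(source, dict):
        # {"includes": [...], "excludes": [...]} form
        if "includes" in source:
            kwargs["source_includes"] = list(source["includes"])
        if "excludes" in source:
            kwargs["source_excludes"] = list(source["excludes"])
    elif isinstance(source, str):
        # a single field name
        kwargs["source_includes"] = [source]
    elif isinstance(source, list):
        # a plain list of field names
        kwargs["source_includes"] = list(source)
    return kwargs


body = {
    "query": {"bool": {"filter": [{"exists": {"field": "customer_name"}}]}},
    "_source": {"includes": ["customer_name", "city"]},
}
kwargs = body_to_kwargs(body)

# Assumed usage against a live cluster (not executed here):
# results = es.search(index="some_index", **kwargs)
```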

Hope it helps others.

everestial007