I have an Elasticsearch query. In Python, I loop through a list of coordinates, insert one coordinate into the query on each iteration, and append the results to a DataFrame.

I want a faster way to get the search results for my list of coordinates, like batch processing them all at once.

I have looked into the terms query and this answer: "ElasticSearch. How to pass array to the search template", but haven't had success.

This is the original query, where I pass in the location coordinates one coordinate at a time:

{
    "size": 0,
    "_source": false,
    "query": {
        "filtered": {
            "filter": {
                "and": [{
                    "geo_distance": {
                        "distance": "50mi",
                        "location": "35.323312, -23.14848"
                    }
                },
                {
                    "bool": {
                        "must": {
                            "term": {
                                "id_from_store": 99
                            }
                        }
                    }
                },
                {
                    "bool": {
                        "must": {
                            "term": {
                                "is_new": 1
                            }
                        }
                    }
                },
                {
                    "bool": {
                        "must": {
                            "range": {
                                "datetime_shelf": {
                                    "gte": "2018-02-01"
                                }
                            }
                        }
                    }
                }]
            }
        }
    },
    "aggs": {
        "group_by_listing": {
            "terms": {
                "field": "p_id",
                "size": 200 
            }
        }
    }
}

Is there a way to pass in a list of coordinates all at once?

{
    "size": 0,
    "_source": false,
    "query": {
        "filtered": {
            "filter": {
                "and": [{
                    "geo_distance": {
                        "distance": "50mi",
                        "location": ["35.323312, -23.14848", "45.23423,  34.2348", ...]
                    }
                },
                {
                    "bool": {
                        "must": {
                            "term": {
                                "id_from_store": 99
                            }
                        }
                    }
                },
                {
                    "bool": {
                        "must": {
                            "term": {
                                "is_new": 1
                            }
                        }
                    }
                },
                {
                    "bool": {
                        "must": {
                            "range": {
                                "datetime_shelf": {
                                    "gte": "2018-02-01"
                                }
                            }
                        }
                    }
                }]
            }
        }
    },
    "aggs": {
        "group_by_listing": {
            "terms": {
                "field": "p_id",
                "size": 200 
            }
        }
    }
}

The query returns a dict mapping each p_id to its doc_count. Would the result be nested for each coordinate? How do I make it return the aggregated doc counts of p_id for each coordinate?
DJ Khaled

1 Answer

How about using the Multi Search API to send all the queries at once?

GET myindex/_msearch
{}
{"size" : 0, "query": {... "35.323312, -23.14848" ...} }
{}
{"size" : 0, "query": {... "45.23423,  34.2348" ...} }
{}
{"size" : 0, "query": {... "35.23423,  -21.234556" ...} }
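
One way to generate that request body from a Python list of coordinates is sketched below. It assumes the same filters as the query in the question (with the redundant bool/must wrappers dropped for brevity); `make_query` and `build_msearch_body` are hypothetical helper names, not part of any client library.

```python
import json


def make_query(location):
    """Build one search body for a single "lat, lon" string.

    Assumption: mirrors the filters from the question, with only the
    geo_distance location varying between queries.
    """
    return {
        "size": 0,
        "query": {"filtered": {"filter": {"and": [
            {"geo_distance": {"distance": "50mi", "location": location}},
            {"term": {"id_from_store": 99}},
            {"term": {"is_new": 1}},
            {"range": {"datetime_shelf": {"gte": "2018-02-01"}}},
        ]}}},
        "aggs": {
            "group_by_listing": {"terms": {"field": "p_id", "size": 200}}
        },
    }


def build_msearch_body(coordinates):
    """Return the newline-delimited JSON body for an _msearch request."""
    lines = []
    for location in coordinates:
        # Empty header line: fall back to the index named in the URL.
        lines.append(json.dumps({}))
        # Body line: the actual query for this coordinate.
        lines.append(json.dumps(make_query(location)))
    # The _msearch body must end with a newline character.
    return "\n".join(lines) + "\n"
```

The resulting string can be sent as the body of a request to myindex/_msearch, for example through the official Python client's msearch method (its exact parameter names depend on the client version).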

That way, you can build all your queries up front, send them in a single request, and have ES execute them all. The results come back in the responses array, in the same order as the queries.
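
Because the responses come back in request order, they can be paired with the original coordinates to get per-coordinate doc counts. A minimal sketch, assuming the parsed JSON response is a dict with a `responses` list and that every query uses the `group_by_listing` aggregation from the question; `counts_by_coordinate` is a hypothetical helper name.

```python
def counts_by_coordinate(coordinates, msearch_response):
    """Flatten an _msearch response into one row per (coordinate, p_id).

    `coordinates` must be in the same order the queries were sent.
    """
    rows = []
    for location, result in zip(coordinates, msearch_response["responses"]):
        # A failed sub-search carries an "error" key instead of aggregations.
        if "error" in result:
            continue
        buckets = result["aggregations"]["group_by_listing"]["buckets"]
        for bucket in buckets:
            rows.append({
                "location": location,
                "p_id": bucket["key"],
                "doc_count": bucket["doc_count"],
            })
    return rows
```

The returned list of dicts can be passed straight to `pandas.DataFrame(rows)`, which gives one row per coordinate and p_id rather than a nested structure.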

Val