I am running a SQL query in Starburst Presto, which is connected to Elasticsearch through the Elasticsearch connector.

The SQL has an "order by" clause, but this clause is not pushed down to Elasticsearch. I want Elasticsearch to sort the data on a specific field and return the result. With "order by", the query takes a long time in Presto. Is there some way to get better performance?

SQL: select e.employee_id from elasticsearch.es."employee:id:""2390571"" && (doj_timestamp:(>=15965454 && <=15972366)) sort=employee_id:desc" e offset 0 limit 5;

The above query is returning random results.

Can anyone please help here?

RoyalTiger

1 Answer

Your query has both ORDER BY and LIMIT, so in Presto it is called a Top N query. Presto currently does not provide Top N pushdown, but this feature is in the works.
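For illustration, here is a sketch of the Top N form of the query from the question. It assumes the same passthrough table name, with the `sort=...` fragment dropped and the ordering expressed in SQL instead. Because there is no Top N pushdown, the sort still runs in Presto after the matching rows are fetched, so this makes the result deterministic but is not expected to make the query faster.

```sql
-- Sketch only: the ORDER BY ... LIMIT pair is what Presto treats as Top N.
-- The sort is evaluated by Presto, not by Elasticsearch.
SELECT e.employee_id
FROM elasticsearch.es."employee:id:""2390571"" && (doj_timestamp:(>=15965454 && <=15972366))" e
ORDER BY e.employee_id DESC
LIMIT 5;
```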

Please file an issue for Elasticsearch connector TopN pushdown. We will implement it anyway, but direct user feedback helps us understand issue priorities.

You can learn more on the #pushdown channel on the Presto community Slack.

Piotr Findeisen
  • Thanks @Piotr for the information. I was struggling to find this on the website; finally I downloaded the Presto code from GitHub and found the same. I have joined the channel and logged my issue there. Do you want me to file the issue anywhere else? – RoyalTiger Aug 13 '20 at 04:21
  • For posterity, the issue link - https://github.com/prestosql/presto/issues/4803 – Piotr Findeisen Aug 13 '20 at 10:56