
From the documentation, I understand that Presto does not use Hive's execution environment.

Is this also the case for other connectors, such as MySQL? What happens when I run "select sum(col) from mysql_table" through Presto?

Does Presto load all of the table's rows into its memory and compute the sum itself, or does it offload the entire computation to MySQL's execution engine?

Thanks!

amrk7

1 Answer


I ran a quick proof of concept on this.

Presto seems to delegate filters to the underlying engine (MySQL here), but computations such as aggregations and joins happen in Presto's memory.

Please correct me if I'm wrong.
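To make that division of work concrete, here is a minimal sketch in Python. It is not Presto's code: `sqlite3` stands in for MySQL, and the helper names (`make_source`, `scan_with_pushed_filter`, `engine_sum`) and the `id` column are hypothetical, introduced only for illustration. The filter is evaluated by the source database, while the sum is computed on the engine side from the rows that come back.

```python
import sqlite3


def make_source():
    # Stand-in for the MySQL table; sqlite3 is used only so the sketch runs anywhere.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mysql_table (id INTEGER, col INTEGER)")
    conn.executemany(
        "INSERT INTO mysql_table VALUES (?, ?)",
        [(1, 10), (2, 20), (3, 30), (4, 40)],
    )
    return conn


def scan_with_pushed_filter(conn, min_id):
    # The WHERE clause is sent to the source, so only matching rows are returned.
    cursor = conn.execute("SELECT col FROM mysql_table WHERE id >= ?", (min_id,))
    for (col,) in cursor:
        yield col


def engine_sum(rows):
    # The aggregation is NOT sent to the source; it runs here, on the engine side.
    total = 0
    for value in rows:
        total += value
    return total


conn = make_source()
result = engine_sum(scan_with_pushed_filter(conn, 3))
print(result)
```

In this sketch the source never sees a `SUM`; it only answers a filtered scan, which matches what the proof of concept appeared to show.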

amrk7
  • Correct. Currently, only simple filters are pushed down to connectors. This will likely change over the next year, but actually implementing better pushdown is not as easy as you would think. There are subtle semantic differences between SQL implementations. For example, MySQL groups together VARCHAR values that differ only in trailing whitespace, whereas Presto correctly treats them as different groups. – Dain Sundstrom Mar 21 '15 at 18:17
  • Thanks! Where can I find a detailed description of Presto's architecture? I'm just curious and want to understand the worker, coordinator, and discovery nodes in more detail. – amrk7 Mar 24 '15 at 04:28
  • I was running some tests and noticed that indexes are not used and Presto is doing a full table scan. Does anyone know if it's possible to tell Presto to push the filters down to MySQL so that the indexes are used? – crorella Apr 15 '16 at 17:25
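
The trailing-whitespace difference described in the first comment can be sketched in a few lines of Python. This only emulates the two grouping behaviors as that comment describes them; it does not call MySQL or Presto, and the sample values are made up for illustration.

```python
from collections import Counter

# Hypothetical VARCHAR values; "a" and "a " differ only in trailing whitespace.
values = ["a", "a ", "b", "b"]

# Exact comparison: "a" and "a " land in different groups
# (the behavior the comment attributes to Presto).
exact_groups = Counter(values)

# Trailing spaces ignored when comparing: "a" and "a " collapse into one group
# (the behavior the comment attributes to MySQL).
padded_groups = Counter(v.rstrip(" ") for v in values)

print(len(exact_groups), len(padded_groups))
```

If a `GROUP BY` were pushed down blindly, the group counts could differ between the two engines, which is the kind of semantic mismatch the comment warns about.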