0

Presto's documentation says max_by(x, y) returns the value of x associated with the maximum value of y over all input values. (https://prestodb.github.io/docs/current/functions/aggregate.html#max_by)

How can I return multiple columns (not only x) associated with the maximum y? I found that max_by((x1, x2, x3), y) works (almost) but it returns a single column with x1, x2, x3 and I don't know how to convert it into multiple columns.

yewang
  • 585
  • 7
  • 14

2 Answers2

4

Starting with Presto 314, it is now possible to reference ROW fields using the [] operator.

SELECT r[1], r[2], r[3]
FROM (
   SELECT max_by((x1, x2, x3), y) r
   FROM (...) t(y, x1, x2, x3)
)
Martin Traverso
  • 4,731
  • 15
  • 24
3

Since Presto 314

As Martin Traverso pointed out, since Presto 314 there is a better option, the [] subscript operator. See Martin's answer.

For older versions, see below

(x1, x2, x3) creates a row with anonymous fields. Currently, to access individual row fields, you need to cast the value to a row with named fields:

CAST(row_value AS row(some_field field_type, other_field, field_type, ...))

In a query this can be inside max_by or outside (doesn't matter). Example:

presto> SELECT r.afield, r.bfield, r.cfield
     -> FROM (
     ->     SELECT max_by(CAST((x1, x2, x3) AS row(afield integer, bfield varchar, cfield double)), y) r
     ->     FROM (VALUES (1, 42, 'a', 13e0), (2, 40, 'b', 77e0)) t(y, x1, x2, x3)
     -> );
 afield | bfield | cfield
--------+--------+--------
     40 | b      |   77.0
(1 row)

I understand this is quite verbose. There is an issue to make this more convenient: https://github.com/prestosql/presto/issues/860.

Piotr Findeisen
  • 19,480
  • 2
  • 52
  • 82