5

I have an array in Presto and I'd like to count how many times each element occurs in it. For example, I have

[a, a, a, b, b]

and I'd like to get something like

{a: 3, b: 2}
Piotr Findeisen
  • 19,480
  • 2
  • 52
  • 82
Max Mir
  • 59
  • 1
  • 3

4 Answers4

5

We do not have a direct function for this, but you can combine UNNEST with histogram:

presto> SELECT histogram(x)
     -> FROM UNNEST(ARRAY[1111, 1111, 22, 22, 1111]) t(x);
     _col0
----------------
 {22=2, 1111=3}

You may want to file a new issue for a direct function for this.

Piotr Findeisen
  • 19,480
  • 2
  • 52
  • 82
4
SELECT
    TRANSFORM_VALUES(
        MULTIMAP_FROM_ENTRIES(
            TRANSFORM(ARRAY['a', 'a', 'a', 'b', 'b'], x -> ROW(x, 1))
        ),
        (k, v) -> ARRAY_SUM(v)
    )

Output: { "a": 3, "b": 2 }

Nitin Mathur
  • 185
  • 1
  • 3
0

You can use REDUCE if there is no support of ARRAY_SUM:

SELECT
    TRANSFORM_VALUES(
        MULTIMAP_FROM_ENTRIES(
            TRANSFORM(ARRAY['a', 'a', 'a', 'b', 'b'], x -> ROW(x, 1))
        ),
        (k, v) -> REDUCE(v, 0, (s, x) -> s + x, s -> s)
    )
Gennady
  • 31
  • 1
  • 3
0

In Presto 0.279, you now have a direct function for this purpose. You can easily use array_frequency. The input is your ARRAY, and the output is a MAP, where keys are the element of the given array and values are the frequencies. Fro example, if you run this SQL :

SELECT array_frequency(ARRAY[1,4,1,3,5,4,7,3,1])

The result will be

{
  "1": 3,
  "3": 2,
  "4": 2,
  "5": 1,
  "7": 1
}
alift
  • 1,855
  • 2
  • 13
  • 28