I have an array in Presto and I'd like to count how many times each element occurs in it. For example, I have
[a, a, a, b, b]
and I'd like to get something like
{a: 3, b: 2}
I have an array in Presto and I'd like to count how many times each element occurs in it. For example, I have
[a, a, a, b, b]
and I'd like to get something like
{a: 3, b: 2}
We do not have a direct function for this, but you can combine UNNEST
with histogram
:
presto> SELECT histogram(x)
-> FROM UNNEST(ARRAY[1111, 1111, 22, 22, 1111]) t(x);
_col0
----------------
{22=2, 1111=3}
You may want to file a new issue for a direct function for this.
SELECT
TRANSFORM_VALUES(
MULTIMAP_FROM_ENTRIES(
TRANSFORM(ARRAY['a', 'a', 'a', 'b', 'b'], x -> ROW(x, 1))
),
(k, v) -> ARRAY_SUM(v)
)
Output: { "a": 3, "b": 2 }
You can use REDUCE if there is no support of ARRAY_SUM:
SELECT
TRANSFORM_VALUES(
MULTIMAP_FROM_ENTRIES(
TRANSFORM(ARRAY['a', 'a', 'a', 'b', 'b'], x -> ROW(x, 1))
),
(k, v) -> REDUCE(v, 0, (s, x) -> s + x, s -> s)
)
In Presto 0.279
, you now have a direct function for this purpose. You can easily use array_frequency. The input is your ARRAY, and the output is a MAP, where keys are the element of the given array and values are the frequencies. Fro example, if you run this SQL :
SELECT array_frequency(ARRAY[1,4,1,3,5,4,7,3,1])
The result will be
{
"1": 3,
"3": 2,
"4": 2,
"5": 1,
"7": 1
}