Hyperunique Aggregations in Calcite-Druid Adapter

Question

In my Druid data source, I have a hyperUnique aggregation, computed at ingestion time, on one of the fields.

I am trying to do the equivalent of COUNT(DISTINCT(<hyperunique_field>)) on this aggregated field.

Is it supported in the Calcite Druid Adapter? If so, what is the correct way to go about it?

In Plywood, I can use COUNT_DISTINCT. Running the SQL below through the adapter returns counts of 0.

SQL:

select floor("__time" to HOUR) time_bucket, "field_1", count(distinct("ingestion_time_aggregated_field")) as uniq from "datasource" where "__time" between '2017-01-01 00:00:00' and '2017-01-02 00:00:00' and "field_1" in ('value_1') and "field_2"='value_2' and "field_3"='value_3' and "field_4"='value_4' group by floor("__time" to HOUR), "field_1" order by floor("__time" to HOUR);

ingestion_time_aggregated_field:

{"name": "ingestion_time_aggregated_field", "type": "hyperUnique","fieldName": “field” }

score 0 · Accepted Answer · answered Feb 28 '17 at 19:28

Complex aggregators are not supported by the Calcite-Druid adapter. The reason is that HLL (HyperLogLog) is approximate, not exact, so it does not actually answer an exact unique-count query such as COUNT(DISTINCT ...).
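For comparison, the approximate count can be requested from Druid directly with a native query that applies a hyperUnique aggregator at query time. The following is only a sketch of what such a groupBy query might look like for the SQL in the question; the output name "uniq" and the data source, dimension, and value names are taken from the question's placeholders. Note that the interval is half-open, so it is not exactly equivalent to the inclusive BETWEEN in the SQL.

```json
{
  "queryType": "groupBy",
  "dataSource": "datasource",
  "granularity": "hour",
  "dimensions": ["field_1"],
  "filter": {
    "type": "and",
    "fields": [
      {"type": "selector", "dimension": "field_1", "value": "value_1"},
      {"type": "selector", "dimension": "field_2", "value": "value_2"},
      {"type": "selector", "dimension": "field_3", "value": "value_3"},
      {"type": "selector", "dimension": "field_4", "value": "value_4"}
    ]
  },
  "aggregations": [
    {
      "type": "hyperUnique",
      "name": "uniq",
      "fieldName": "ingestion_time_aggregated_field"
    }
  ],
  "intervals": ["2017-01-01T00:00:00/2017-01-02T00:00:00"]
}
```

The result is an HLL estimate of the distinct count per hour and per "field_1" value, not an exact count, which is the distinction the answer above draws.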

Slim Bouguerra