Using analytical Count(distinct) on Vertica is not supported

Question

Having a thorough Google research, it seems that Vertica DB simply does not support count(distinct <col>) over(<partition by>), as it causes:

"ERROR 4249: Only MIN/MAX are allowed to use DISTINCT ... MIN/MAX are allowed to use DISTINCT"

I'm looking for an easy walk-around for this one.

Meanwhile, I'm using joins or nested queries.

For example:

select campaign_id, segment_id, COUNT(DECODE(rank, 1, 1, NULL)) over()
from (select campaign_id, segment_id, row_number() over(partition by segment_id) rank
 from cs)

But my query is very long and I need to invent tricks all over the way. Any idea for a better approach?

Thanks!

(Working at HPE? Please implement this, as you did for all common analytical funcitions!)

Your solution is fine. I mean, I wouldn't use `decode` but it is a very reasonable solution to the problem. — Gordon Linoff, Dec 28 '16 at 21:09

score 0 · Answer 1 · answered Dec 29 '16 at 17:19

I had to do something similar nested counting structure for counting distinct values cumulatively, over a date range. It boiled down to a similar gathering up of ROW_NUMBER() = 1 rows, though I used case:

COUNT(CASE WHEN rank = 1 THEN userID END) OVER (...)

It wasn't pretty to look at, but it was mercifully not slow.

I need to invent tricks all over the way

Yeah, I think that just happens when you bump into missing features.

Using analytical Count(distinct) on Vertica is not supported

1 Answers1