Why we cannot use boolean values in aggregate functions without casting to some integer type first? In many cases it makes perfect sense to calculate sum, average or correlation from columns of boolean data type.
Consider the following example where boolean input has to be always casted to int
in order to make it work:
select
sum(boolinput::int),
avg(boolinput::int),
max(boolinput::int),
min(boolinput::int),
stddev(boolinput::int),
corr(boolinput::int,boolinputb::int)
from
(select
(random() > .5)::boolean as boolinput,
(random() > .5)::boolean as boolinputB
from
generate_series(1,100)
) a
From PostgreSQL documentation:
Valid literal values for the "true" state are: TRUE 't' 'true' 'y' 'yes' 'on' '1'
For the "false" state, the following values can be used: FALSE 'f' 'false' 'n' 'no' 'off' '0'
Because by definition TRUE
equals 1
and FALSE
equals 0
I do not understand why casting is necessary.
Allowing boolean in aggregation would have also interesting side effects - we can for example simplify many case statements:
Current version (clean and easy to understand):
select sum(case when gs > 50 then 1 else 0 end) from generate_series(1,100) gs;
Using old fashioned casting operator ::
:
select sum((gs > 50)::int) from generate_series(1,100) gs;
Direct aggregation of boolean values (not working currently):
select sum(gs > 50) from generate_series(1,100) gs;
Is direct aggregation of boolean values possible in other DBMSs? Why this is not possible in PostgreSQL?