You are using the wrong syntax for COUNT(DISTINCT)
. The DISTINCT
part is a keyword, not a function. Based on the docs, this ought to work:
SELECT
SUM(column1) AS column1,
SUM(column2) AS column2,
COUNT(DISTINCT column3) AS column3
FROM table
Do, however, read the docs. BigQuery's implementation of COUNT(DISTINCT)
is a bit unusual, apparently so as to scale better for big data. If you are trying to count a large number of distinct values then you may need to specify a second parameter (and you have an inherent scaling problem).
Update:
If you have a large number of distinct column3
values to count, and you want an exact count, then perhaps you can perform a join instead of putting a subquery in the select list (which BigQuery seems not to permit):
SELECT *
FROM (
SELECT
SUM(column1) AS column1,
SUM(column2) AS column2
FROM table
)
CROSS JOIN (
SELECT count(*) AS column3
FROM (
SELECT column3
FROM table
GROUP BY column3
)
)
Update 2:
Not that joining two one-row tables would be at all expensive, but @FelipeHoffa got me thinking more about this, and I realized I had missed a simpler solution:
SELECT
SUM(column1) AS column1,
SUM(column2) AS column2,
COUNT(*) AS column3
FROM (
SELECT
SUM(column1) AS column1,
SUM(column2) AS column2
FROM table
GROUP BY column3
)
This one computes a subtotal of column1
and column2
values, grouping by column3
, then counts and totals all the subtotal rows. It feels right.