0

I have done some tests with Having In Google BigQuery and found out: I can use the column alias in HAVING without error

SELECT
  geoNetwork_city as total_CITY,
 sum( totals_transactions) as total_transcations, 
FROM
  tablename
GROUP BY
  geoNetwork_city
  HAVING total_CITY not like 'NEW YORK'
  ORDER by distinct_visitors desc

But isn't HAVING clause got executed before SELECT clause, which the alias hasn't formed yet? Why will we not receive any error?

Also, can we replace where with Having for non-aggregation function?

SELECT
  geoNetwork_city as total_CITY
FROM
  tablename
HAVING 
 total_CITY not like 'NEW YORK'

2 How about alias using in MS SQL Server? I found in MS SQL, the only place (I'm aware of) that you can reference aliases is in the ORDER BY clause. Is this correct?

NewPy
  • 623
  • 1
  • 5
  • 14
  • I think, for performance reasons that you could probably check with execution plans, that filtering on non-aggregate columns (testing against 'New York' in this case) is better done in a WHERE rather than the HAVING. [This question](https://stackoverflow.com/questions/5657876/does-mysql-use-indexes-on-having) has answers that suggest HAVING might not use indexes, for example. – Phil M Dec 08 '18 at 01:22
  • @Gserg thanks. I have tested in Bigquery and looks ok. Since you mentioned it, I did remember I have experience error in SQL server. This is why I tagged both and would like to see whether it is general SQL standard rule or as per different DBMSs cases – NewPy Dec 08 '18 at 01:26

2 Answers2

2

Below is for BigQuery Standard SQL

WHERE is applied before, but HAVING is applied after SELECT with GROUP BY so alias is visible for HAVING - that is why you do not see any error as it is correct

Also HAVING is applicable for cases when you use GROUP BY - otherwise you will get error like below

The HAVING clause requires GROUP BY or aggregation to be present   

So, below will produce error

#standardSQL
SELECT geoNetwork_city as total_CITY
FROM `project.dataset.tablename`
HAVING total_CITY not like 'NEW YORK'

while below will work

#standardSQL
SELECT geoNetwork_city, 
  SUM( totals_transactions) as total_transcations
FROM `project.dataset.tablename`
WHERE geoNetwork_city NOT LIKE 'NEW YORK'
GROUP BY geoNetwork_city
HAVING total_transcations > 30
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
2

HAVING is filtering after the GROUP BY. SQL queries are not executed clause-by-clause, but they are parsed clause-by-clause. For common clauses the ordering is:

  • FROM . . . this defines the table aliases
  • WHERE . . . filters the rows
  • GROUP BY . . . defines the rows
  • SELECT . . . defines column aliases
  • HAVING . . . filters the rows when GROUP BY is present
  • ORDER BY . . . orders the result set
  • LIMIT/FETCH . . . limits the number of rows

This order can vary slightly by database (a bit more below).

In pretty much all databases, you can order by column aliases. This is true in MS SQL and BigQuery. This is part of the ISO/ANSI standard.

In some databases, you can GROUP BY column aliases (which affects the above ordering). That includes BigQuery (and MySQL and Postgres), but not SQL Server.

In some databases, you can refer to column aliases in the HAVING clause. That includes BigQuery (and MySQL and Postgres and maybe a few more).

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786