rule of using HAVING IN Standard RDBMS

Question

I have done some tests with Having In Google BigQuery and found out: I can use the column alias in HAVING without error

SELECT
  geoNetwork_city as total_CITY,
 sum( totals_transactions) as total_transcations, 
FROM
  tablename
GROUP BY
  geoNetwork_city
  HAVING total_CITY not like 'NEW YORK'
  ORDER by distinct_visitors desc

But isn't HAVING clause got executed before SELECT clause, which the alias hasn't formed yet? Why will we not receive any error?

Also, can we replace where with Having for non-aggregation function?

SELECT
  geoNetwork_city as total_CITY
FROM
  tablename
HAVING 
 total_CITY not like 'NEW YORK'

2 How about alias using in MS SQL Server? I found in MS SQL, the only place (I'm aware of) that you can reference aliases is in the ORDER BY clause. Is this correct?

I think, for performance reasons that you could probably check with execution plans, that filtering on non-aggregate columns (testing against 'New York' in this case) is better done in a WHERE rather than the HAVING. [This question](https://stackoverflow.com/questions/5657876/does-mysql-use-indexes-on-having) has answers that suggest HAVING might not use indexes, for example. — Phil M, Dec 08 '18 at 01:22
@Gserg thanks. I have tested in Bigquery and looks ok. Since you mentioned it, I did remember I have experience error in SQL server. This is why I tagged both and would like to see whether it is general SQL standard rule or as per different DBMSs cases — NewPy, Dec 08 '18 at 01:26

Mikhail Berlyant · Answer 1 · 2018-12-08T03:46:05.870

2

Below is for BigQuery Standard SQL

WHERE is applied before, but HAVING is applied after SELECT with GROUP BY so alias is visible for HAVING - that is why you do not see any error as it is correct

Also HAVING is applicable for cases when you use GROUP BY - otherwise you will get error like below

The HAVING clause requires GROUP BY or aggregation to be present

So, below will produce error

#standardSQL
SELECT geoNetwork_city as total_CITY
FROM `project.dataset.tablename`
HAVING total_CITY not like 'NEW YORK'

while below will work

#standardSQL
SELECT geoNetwork_city, 
  SUM( totals_transactions) as total_transcations
FROM `project.dataset.tablename`
WHERE geoNetwork_city NOT LIKE 'NEW YORK'
GROUP BY geoNetwork_city
HAVING total_transcations > 30

edited Dec 08 '18 at 03:46

answered Dec 08 '18 at 01:19

Mikhail Berlyant

165,386
8
154
230

If it is an aggregation function alias for example SELECT userid,sum(transcation_id) as tid FROM usertable GROUP BY userid HAVING tid>30, is it ok? – NewPy Dec 08 '18 at 01:23
yes, should work! that is exactly one of main case for use of HAVING – Mikhail Berlyant Dec 08 '18 at 01:24
1

Thank you @GSerg - Good point - i didn't see tags for `sql` and `sql-server` and answered with `google-bigquery` in mind. updated my answer - to clarify – Mikhail Berlyant Dec 08 '18 at 01:27
your second "#standardSQL" is wrong, the alias `total_CITY` is not declared – Paul Maxwell Dec 08 '18 at 02:56
@Used_By_Already - thank you - was copy-pasting examples from question to modify and ended up leaving it there, so removed it as it was intermediate text – Mikhail Berlyant Dec 08 '18 at 03:52

score 2 · Accepted Answer · answered Dec 08 '18 at 13:16

HAVING is filtering after the GROUP BY. SQL queries are not executed clause-by-clause, but they are parsed clause-by-clause. For common clauses the ordering is:

FROM . . . this defines the table aliases
WHERE . . . filters the rows
GROUP BY . . . defines the rows
SELECT . . . defines column aliases
HAVING . . . filters the rows when GROUP BY is present
ORDER BY . . . orders the result set
LIMIT/FETCH . . . limits the number of rows

This order can vary slightly by database (a bit more below).

In pretty much all databases, you can order by column aliases. This is true in MS SQL and BigQuery. This is part of the ISO/ANSI standard.

In some databases, you can GROUP BY column aliases (which affects the above ordering). That includes BigQuery (and MySQL and Postgres), but not SQL Server.

In some databases, you can refer to column aliases in the HAVING clause. That includes BigQuery (and MySQL and Postgres and maybe a few more).

rule of using HAVING IN Standard RDBMS

2 Answers2