2

I have two different tables with their different columns as below:

CREATE TABLE T1(C1 INT)
CREATE TABLE T2(C2 INT)

Every programmer knows if we write a query with wrong syntax, query compiler should give us an error. Such as this one:

SELECT C1 FROM T2 
--ERROR: Invalid column name 'C1'.

But if we use this wrong query as inner select, unfortunately SQL will execute it:

SELECT * 
FROM T1
WHERE C1 IN (SELECT C1 FROM T2) 
--returns all rows of T1

And also the following wrong query will execute too and returns all rows of T1

SELECT * 
FROM T1
WHERE EXISTS (SELECT C1 FROM T2) 
--returns all rows of T1

It gets worse when we use these wrong queries in UPDATE such as:

UPDATE T1
SET C1 = NULL
WHERE C1 IN (SELECT C1 FROM T2) 
--updates all rows of T1

Now, I want to prevent this bug. I can force my DB developers to be careful but is there any systematic way to prevent this bug?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Merta
  • 965
  • 1
  • 11
  • 23
  • 1
    Always use a two part name - `Table.Column`. – ta.speot.is Jan 14 '18 at 07:17
  • Just to be clear, this is not a SQL Server bug, The examples are a bug in the application query. Ambiguous column names are resolved the innermost scope. The best practice in multi-table queries is to qualify columns with the table name or alias to avoid surprises and improve maintainability. – Dan Guzman Jan 14 '18 at 13:08
  • You can prevent updates via trigger – Dejan Dozet Jan 14 '18 at 16:50

2 Answers2

6

Ever heard of Correlated Subquery, you can always refer outer query columns inside the subquery

am sure you must have seen queries like this

SELECT * FROM T1 
WHERE EXISTS (SELECT 1 FROM T2 where t1.c1 = t2.c2) 

here C1 column from T1 is referred in Where clause, you are referring in Select thats the difference. There is no BUG here

Pரதீப்
  • 91,748
  • 19
  • 131
  • 172
  • Yes, this is quite common and useful. – ta.speot.is Jan 14 '18 at 07:19
  • 1
    From the documentation: `In queries that include a correlated subquery (also known as a repeating subquery), the subquery depends on the outer query for its values. This means that the subquery is executed repeatedly, once for each row that might be selected by the outer query`. Which means this might execute quite poorly in some cases (although I'm inclined to believe the optimizer is going to Do Something About It). – Clockwork-Muse Jan 14 '18 at 07:20
  • I'd imagine most enterprise grade databases will be able to rewrite a typical correlated subquery so that the query is executed only once.. if it makes you feel happier though, there's usually little harm in taking the correlated query and inner joining it as a subquery in the join area. Profiling the two will tell you whether your database successfully rewrote it – Caius Jard Jan 14 '18 at 20:03
5

Is there any systematic way to prevent this bug?

Always use two-part names - [Table].[Column].

ta.speot.is
  • 26,914
  • 8
  • 68
  • 96