2

I have fetch 3 highest salary from my table Here my query

SELECT DISTINCT salary
FROM salary e1
WHERE 3 = (
SELECT count( DISTINCT salary )
FROM salary e2
WHERE e1.salary >= e2.salary ) 

It's working fine but I want know, why we put condition "Where 3"? how it is performing?

Jaskaran Zap
  • 277
  • 1
  • 2
  • 9

2 Answers2

3

Your query is slow not because of that 3 but because it unnecessarily uses in inner query. The answer provided by @spencer7593 explains how it works and you can see it does the same thing again and again.

The query can be written as simple as this:

SELECT DISTINCT e1.salary
FROM salary e1
ORDER BY e1.salary DESC
LIMIT 3;

Note:

The alias e1 is not needed but I let it in the FORM clause and used it in the ORDER BY clause to make clear what represents each instance of salary (both the table and the field have this name).

How it works:
It ORDERs the rows from table "salary" BY field salary descending (greatest salaries first), SELECTs only DISTINCT values (this is where you wrote it well) and LIMITs the result set to 3 rows.

In fact, it stops the execution after it finds the 3 greatest distinct values from column salary.

If the table has an index on field salary it runs like the wind. If it hasn't such an index, create it now:

ALTER TABLE `salary` ADD INDEX (`salary`);
Community
  • 1
  • 1
axiac
  • 68,258
  • 9
  • 99
  • 134
2

The equality comparison is comparing two values. A constant value of 3 on the one side, and the value returned by a subquery on the other side.

If the result of the comparison is TRUE for a row being evaluated, then the row will be returned. Otherwise, the row won't be returned.

The "trick" here is the subquery on the right side. That's what we call a "correlated" subquery. That gets run for every row that's being evaluated.

Basically, the operation can be described like this...

Access a row from salary table, and get the value of the salary column from it.

Run another query, using the value of salary column to get a count.

Compare that count to a constant value of 3.

If the comparison yields TRUE, return the row, otherwise discard it.

Access the next row from the salary table and get the salary value.

Run another query, using the value we just got, to get a count.

Compare the count to a constant value of 3.

If the comparison yields TRUE, return the row, otherwise discard.

And just repeat that until we've done it for all rows in the salary table.

The final step is the DISTINCT operation, which eliminates any duplicates that we would otherwise be returning.

That's how it's performing.


In terms of performance, on large sets, a correlated subquery is going to each our lunch, and our lunch box too. This is not the only query pattern that can return this result, there are other queries that will do less work to return an equivalent result.

spencer7593
  • 106,611
  • 15
  • 112
  • 140