1

i have those tables

     table1
|  id  |  name  |
|  1   |  axe   |
|  2   |  bow   |
|  3   |  car   |
|  4   |  dart  |


        table2                                        table3
|  t1_id  |  number  |                        |  t1_id  |  letter  |
|  1      |  5       |                        |  1      |  a       |
|  1      |  6       |                        |  1      |  b       |
|  1      |  2       |                        |  1      |  c       |
|  2      |  2       |                        |  2      |  a       |
|  2      |  2       |                        |  2      |  c       |
|  2      |  3       |                        |  2      |  r       |
|  3      |  8       |                        |  3      |  y       |
|  3      |  3       |                        |  3      |  i       |
|  3      |  1       |                        |  3      |  a       |
|  4      |  8       |                        |  4      |  a       |
|  4      |  9       |                        |  4      |  b       |
|  4      |  10      |                        |  4      |  c       |

and table1(id) is linked with table2(t1_id), table3(t1_id)

i run it to get them order by the highest letter_count match then by the highest average_number match to get this proper result http://www.sqlfiddle.com/#!9/69086b/8/0

SELECT 
  t1.id, 
  t1.name

FROM 
  table1 t1

INNER JOIN 
  table2 t2
    ON t2.t1_id = t1.id

LEFT JOIN 
  table3 t3
    ON t3.t1_id = t1.id
      AND t3.letter IN ('a', 'b', 'c')

GROUP BY
  t1.id

ORDER BY
  COUNT(t3.letter) DESC,
  AVG(t2.number) DESC

|  id  |  name  |
|  4   |  dart  |
|  1   |  axe   |
|  2   |  bow   |
|  3   |  car   |

and everything is working ok


but when i wanted to check if there is any problems with the query i decided to check the letter_count and avg_number so i used this query

SELECT 
  t1.id, 
  t1.name, 
  COUNT(t3.letter) AS letter_count, 
  AVG(t2.number) AS avg_number

FROM 
  table1 t1

INNER JOIN 
  table2 t2
    ON t2.t1_id = t1.id

LEFT JOIN 
  table3 t3
    ON t3.t1_id = t1.id
      AND t3.letter IN ('a', 'b', 'c')

GROUP BY
  t1.id

ORDER BY
  letter_count DESC,
  avg_number DESC

what i expected the result to be was

|  id  |  name  |  letter_count  |  avg_number   |
|  4   |  dart  |  3             |  9            |
|  1   |  axe   |  3             |  4.3333333333 |
|  2   |  bow   |  2             |  2.3333333333 |
|  3   |  car   |  1             |  4            |

but the result i got was http://www.sqlfiddle.com/#!9/69086b/3/0

|  id  |  name  |  letter_count  |  avg_number   |
|  4   |  dart  |  9             |  9            |
|  1   |  axe   |  9             |  4.3333333333 |
|  2   |  bow   |  6             |  2.3333333333 |
|  3   |  car   |  3             |  4            |

what surprised me was the multiplied row of letter_count which can be solved by derived queries but i don't want to select the letter_count or number_average i only want to ORDER BY them

does keeping the query like it is with ORDER BY only wont affect the query performance or should i still use derived queries even if i don't need to select the data values since the order is correct anyway or would derived query be faster in huge tables?

Joe Doe
  • 523
  • 2
  • 9

2 Answers2

1

You've really asked 2 questions here:

  1. Does the ORDER BY clause affect query performance
  2. Why is my letter count not as expected

Either way, to evaluate the ORDER BY clause, the expressions need to be evaluated so that the order can be determined. In your first example you need to specify the expressions because those columns are not contained within the SELECT statement.

However in your second query you have selected the columns that you want to order by and because ORDER BY is evaluated AFTER the entire query has been processed, you can simply use the column ALIAS in the ORDER BY clause, instead of executing the function again.

some RDBMS query optimisers will convert your expressions in the ORDER BY statements to use the column alias for you IF you are ordering on expressions that exist in your SELECT clause

You were right to do this however, your letter count expression is not current due to the duplicates in the response.

You can simply change the COUNT expression to use a distinct clause to only count unique values.

COUNT(DICTINCT t3.letter)

This makes your original query now look like this:

SELECT 
  t1.id, 
  t1.name

FROM 
  table1 t1

INNER JOIN 
  table2 t2
    ON t2.t1_id = t1.id

LEFT JOIN 
  table3 t3
    ON t3.t1_id = t1.id
      AND t3.letter IN ('a', 'b', 'c')

GROUP BY
  t1.id

ORDER BY
  COUNT(DICTINCT t3.letter) DESC,
  AVG(t2.number) DESC
Chris Schaller
  • 13,704
  • 3
  • 43
  • 81
  • Remove the group by statement and include `t3.letter` and `t2.number` in your select, that should highlight why count in your original query is returning a different value to what you are expecting. Over a large record set, this value being wrong might result in a different order than what you are expecting. Again it's up to you if the order is important at all – Chris Schaller Jul 16 '19 at 23:37
  • what would have happened if i didn't added the `DISTINCT` and left it as it is in the future with millions of records? and should i fix the response of duplicates with derived queries or no need to fix it in my case of "only ordering"? – Joe Doe Jul 16 '19 at 23:37
  • 1
    This is a simple enough query, I suspect using sub-queries or derived queries would both complicate this query and make it slower over a much larger record set. It will depend on your derived query though. If you are not happy with the performance of this query then make your derived query and compare the execution times, we can talk theory but the actual execution plan and times will be greatly affected by your specific query implementation, schema and server resources – Chris Schaller Jul 16 '19 at 23:44
0

This retrieves what you require:

SELECT a.id, a.name, count(a.letter) letter_count, avg(a.number) avg_number
FROM( SELECT   
t1.id, 
t1.name, 
t3.letter, 
t2.number--,  
FROM 
table1 t1

INNER JOIN 
table2 t2
ON t2.t1_id = t1.id

INNER JOIN 
table3 t3
ON t3.t1_id = t1.id
    AND t3.letter IN ('a', 'b', 'c')

GROUP BY
t1.id,  t1.name,  t3.letter, t2.number) a GROUP BY id, a.name

ORDER BY
letter_count DESC,
avg_number DESC
lije
  • 420
  • 2
  • 15
  • i refactored it and it worked but with the 2nd query results http://www.sqlfiddle.com/#!9/69086b/17/0 – Joe Doe Jul 16 '19 at 23:35