1

Somebody already helped me with this query but I made an adaptation and I get a problem :

    SELECT 
        AVG(tyd.price) AS avg_price, COUNT(tyd.id_product) AS cnt, 
        tyd.id_marchand, tyd.id_product, 
        catalog.price AS c_price, tyd.price AS t_price, 
        tyd.amount AS t_am, pro_tyd.amount AS p_am, 
        pro_tyd.price AS p_price, catalog.img_src,  
        tyd.step, tyd.login AS tyd_l
    FROM catalog 
    INNER JOIN tyd ON catalog.id_marchand = tyd.id_marchand 
                   AND catalog.id_product =   tyd.id_product
                   AND tyd.step = "1" 
    INNER JOIN pro_tyd  ON tyd.id_marchand = pro_tyd.id_marchand 
                        AND tyd.id_product = pro_tyd.id_product
    GROUP BY 
         catalog.id_product, catalog.id_marchand
    HAVING 
         tyd.login = "user1@tyd.fr"

and it only works when tyd.login = "user3@tyd.fr" which is the lower id. doesn't work with user1 or user2...I just can't figure why...! Also it work if i group by tyd.login but in this case the AVG and COUNT function work only on a single line...

Here is the table :

id  id_marchand  id_product   login       price  amount  delay  step    time   
29      1           1       user3@tyd.fr   344     1       0      1   1343297500
120     1           1       user1@tyd.fr   54      1       0      1   1343297504
109     1           1       user10@tyd.fr  34      1       0      1   1343298598

When HAVING tyd.login = "user3@tyd.fr" it works perfectly. When user1 or user2 i got 0 lines.

thanks for your help

First subject : Sql, get the average on a group while escaping the where condition

Community
  • 1
  • 1
pierreaurelemartin
  • 1,592
  • 1
  • 14
  • 22
  • 1
    You're abusing `HAVING`, which is normally used for group filtering or with aggregate functions. I think you should just use a `WHERE tyd.login = "user1@tyd.fr"`. Also, you don't have a "user1", but a "user10". – Cᴏʀʏ Jul 31 '12 at 16:45
  • if i don't use the having clause i only get the AVG and COUNT of the userX line. And i want the AVG and COUNT of every line where it's the same id_product and id_marchand. Sorry for the user1 and 10, i correct it. – pierreaurelemartin Jul 31 '12 at 16:50
  • What do you Exactly want from this query? and what do you mean by **When HAVING tyd.login = "user3@tyd.fr" it works perfectly.**. Can you please explain more? – hmmftg Jul 31 '12 at 17:36
  • I want AVG and COUNT of every lines where id_product and id_marchand are equal of the id_product and id_marchand WHERE step = 1 HAVING userX in login. then the other select stuff of this line. The line i putted on my asking are unique. An user can't get on step = 1 two times for the same product. – pierreaurelemartin Jul 31 '12 at 17:41
  • When i do my query with HAVING tyd.login = "user3" i get the whole AVG & COUNT + my selected row for that line. When i put HAVING tyd.login = user10, i get an empty answer. The only difference between user3 and user1 line is the record order. For example, if i delete user3 line, the query will work with user1 but not with user10. – pierreaurelemartin Jul 31 '12 at 17:44

2 Answers2

3

The problem is that your query is non deterministic. You are selecting more columns that you are grouping by where the additional columns are not determined by the columns you are grouping by. If the latter had been true, then this would fall into the ANSII standards of SQL, since it is not it is (in my opinion) a failure of MySQL that it allows the statement to run at all. Other DBMS have gone the opposite way, since they cannot determine if certain columns are functions of other columns not contained in the group by, they will now allow any statement with columns in the select list that are not contained within the group by.

To try and simplify the problem take the following dataset (Table T)

ID    Col1    Col2
1     1       1
2     1       3

Running this:

SELECT Col1, MAX(Col2) AS MaxCol2, MIN(Col2) AS MinCol2, AVG(Col2) AS AvgCol2
FROM T
GROUP BY Col1

Will always return

Col1    MaxCol2    MinCol2    AvgCol2
1       3          1          2

However, if you throw ID into the mix

SELECT ID, Col1, MAX(Col2) AS MaxCol2, MIN(Col2) AS MinCol2, AVG(Col2) AS AvgCol2
FROM T
GROUP BY Col1

There is no way of determining which ID will be returned, 1 or 2, the most likely result is

ID    Col1    MaxCol2    MinCol2    AvgCol2
1    1       3          1          2

However there is nothing defined in the SQL to state that the result could not be:

ID    Col1    MaxCol2    MinCol2    AvgCol2
2     1       3          1          2

So if if the above was the result set, and you added HAVING ID = 1 to the query, you would get no results due to the point at which the HAVING clause is applied to the data. If you were to Add ID to the GROUP BY you would end up with 2 rows, which as I understand it is not what you want, and if you were to add it to the WHERE your MIN, MAX and AVG functions would be affected. So you need to use a subquery. So in this example I would use

SELECT  T.ID, T.Col1, MaxCol2, MinCol2, AvgCol2
FROM    T
        INNER JOIN
        (   SELECT Col1, MAX(Col2) AS MaxCol2, MIN(Col2) AS MinCol2, AVG(Col2) AS AvgCol2
            FROM T
            GROUP BY Col1
        ) T2
            ON T.Col1 = T2.Col1
WHERE   ID = 1 -- OR ID = 2 DEPENDING ON REQUIREMENTS

To apply this to your situation, the database engine has deteremined that the row that will be returned for the columns not in the group by is the row containing ID = 29. So your HAVING Clause is only being applied to this row, the rows with user1@tyd.fr, and user10@tyd.fr in have already been removed from the results by the time your HAVING clause is applied. You need to perform the aggregate functions separately to the filtering.

Now I haven't fully got my head around your schema, but I am hoping I have explained the issue of the non deterministic statement well enough that you can make any ammendments required to my attempt at rewriting your query

SELECT  Avg_Price,
        Cnt,
        tyd.id_marchand, 
        tyd.id_product, 
        catalog.price AS c_price, 
        tyd.price AS t_price, 
        tyd.amount AS t_am, 
        pro_tyd.amount AS p_am, 
        pro_tyd.price AS p_price, 
        catalog.img_src,  
        tyd.step, 
        tyd.login AS tyd_l
FROM    Catalog
        INNER JOIN tyd
            ON catalog.id_marchand = tyd.id_marchand 
            AND catalog.id_product = tyd.id_product
        INNER JOIN Pro_tyd
            ON tyd.id_marchand = pro_tyd.id_marchand 
            AND tyd.id_product = pro_tyd.id_product
        INNER JOIN
        (   SELECT  ID_Marchand, ID_Product, Step, AVG(tyd.price) AS avg_price, COUNT(tyd.id_product) AS cnt
            FROM    Tyd
            WHERE   Step = '1'
            GROUP BY ID_Marchand, ID_Product, Step
        ) Agg
            ON Agg.id_marchand = pro_tyd.id_marchand 
            AND Agg.id_product = pro_tyd.id_product
            AND Agg.Step = tyd.Step
WHERE   tyd.Login = 'user1@tyd.fr'
GarethD
  • 68,045
  • 10
  • 83
  • 123
  • I totally agree and understand with your explaination. I don't have try your solution but i will and i'm not familiar with subqueries so it helps me. But if i follow your post, the problem should be the same with the query there (Marco) : http://stackoverflow.com/questions/11740458/sql-get-the-average-on-a-group-while-escaping-the-where-condition but i get all the line and not only the first one. In that query, the having is not selective on one row. But may be it's not exactly the same group by row. I'll look asap. thanks a lot for your help. – pierreaurelemartin Aug 01 '12 at 10:25
  • So the query you gave me is perfectly working. But really, i don't understand why it's working with step = "0" and not when step = "1". do you think i must rewrite the first query as you told me ? – pierreaurelemartin Aug 01 '12 at 12:28
  • I've updated the answer with a possible solution. I am still not sure exactly what you are after, or what the average should be of or how it should be grouped. But hopefully the new query will get you in the right direction. – GarethD Aug 01 '12 at 12:48
  • Sorry i expressed myself wrong. The query that you gave me produce exactly the result i was expecting for ! What i don't understand is why the query which Marco gave me there : http://stackoverflow.com/questions/11740458/sql-get-the-average-on-a-group-while-escaping-the-where-condition works with step = 0 and doesn't work with step = 1. When step = 0 i get the good result, when step = 1 i get empty result...i'd like to understand why that's all. But thanks a lot for your query which is perfect ! – pierreaurelemartin Aug 01 '12 at 13:30
0

i just want to add that this query :

SELECT * FROM  tyd WHERE step = "1" 
GROUP BY tyd.id_product, tyd.id_marchand 
HAVING tyd.login = "user1@tyd.fr" 

Has the same problem. Works when tyd.login = "user3" but not when an other user is ask...

pierreaurelemartin
  • 1,592
  • 1
  • 14
  • 22
  • If you take my table in first post, i obtain this line : 120 / 1 / 1 / user3@tyd.fr / 432 (354 + 44 + 34 /3) / 3 (1+1+1) / 1 / time...when i put HAVING login = user3 and zero lines when HAVING login = user1 – pierreaurelemartin Jul 31 '12 at 18:45