18

currently im working with mysql 5.7 in development, and 5.6 in production. Each time i run a query with a group by in development i get some error like "Error Code: 1055. Expression #1 of SELECT list is not in GROUP BY "

Here is the query.

SELECT c.id, c.name, i.* 
 FROM countries c, images i 
WHERE i.country_id = c.id
GROUP BY c.id; Fixed for 5.7; 

SELECT c.id, c.name,
       ANY_VALUE(i.url) url, 
       ANY_VALUE(i.lat) lat, 
       ANY_VALUE(i.lng) lng 
  FROM countries c, images i
 WHERE i.country_id = c.id
 GROUP BY c.id;

For solving that I use the mysql function from 5.7 ANY_VALUE, but the main issue is that its not available in mysql 5.6

So if I fix the sql statement for development i will get an error in production.

Do you know any solution or polifill for the ANY_VALUE function in mysql 5.6?

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
Tim
  • 585
  • 1
  • 5
  • 14

3 Answers3

21

You're misusing the notorious nonstandard MySQL extension to GROUP BY. Standard SQL will always reject your query, because you're mentioning columns that aren't aggregates and aren't mentioned in GROUP BY. In your dev system you're trying to work around that with ANY_VALUE().

In production, you can turn off the ONLY_FULL_GROUP_BY MySQL Mode. Try doing this:

  SET @mode := @@SESSION.sql_mode;
  SET SESSION sql_mode = '';
  /* your query here */
  SET SESSION sql_mode = @mode;

This will allow MySQL to accept your query.

But look, your query isn't really correct. When you can persuade it to run, it returns a randomly chosen row from the images table. That sort of indeterminacy often causes confusion for users and your tech support crew.

Why not make the query better, so it chooses a particular image. If your images table has an autoincrement id column you can do this to select the "first" image.

SELECT c.id, c.name, i.*
  FROM countries c
  LEFT JOIN (
       SELECT MIN(id) id, country_id
         FROM images
        GROUP BY country_id
       ) first ON c.id = first.country_id
  LEFT JOIN images i ON first.id = i.id

That will return one row per country with a predictable image shown.

Graham
  • 7,431
  • 18
  • 59
  • 84
O. Jones
  • 103,626
  • 17
  • 118
  • 172
  • 1
    pd: i hate subselects! arent they slower? – Tim Jul 22 '17 at 15:26
  • 2
    I guess, by *subselect*, you mean a join to an aggregate like I have. In my experience, their performance is, like other query performance, determined by choice of indexes and other such things. The one I've shown can greatly speed up a query. It reduces the number of rows to consider. And the aggregate query can use the astonishingly efficient loose index scan do to its thing. Read about that. – O. Jones Jul 22 '17 at 17:19
  • 2
    Of note: ANY_VALUE executes much faster than MIN because MIN requires a full evaluation of the dataset. MIN should only be used when you expect there to be multiple rows in table 'i' for each row in table 'c', but when you you join such that you have many values of 'c' per 'i', then ANY_VALUE should be preferred since all of the values are going to be the same anyway. – Nosajimiki Oct 07 '19 at 15:21
19

Instead of ANY_VALUE, you could use the MIN or MAX aggregate functions.

Alternatively, you might consider not setting the ONLY_FULL_GROUP_BY SQL mode, which is set by default since MySQL 5.7.5, and is responsible for the difference you experience with MySQL 5.6. Then you can delay the update of your queries until you have migrated all your environments to a current version of MySQL.

Which of the two is the better option, is debatable, but in the long term it will be better to adapt your queries so they adhere to the ONLY_FULL_GROUP_BY rule. Using MIN or MAX can certainly be of use in doing that.

trincot
  • 317,000
  • 35
  • 244
  • 286
4

For decades you could write queries that were not valid in standard SQL but perfectly valid mysql

In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. For example, this query is illegal in standard SQL because the nonaggregated name column in the select list does not appear in the GROUP BY:

SELECT o.custid, c.name, MAX(o.payment) FROM orders AS o, customers AS c WHERE o.custid = c.custid GROUP BY o.custid; For the query to be legal, the name column must be omitted from the select list or named in the GROUP BY clause.

MySQL extends the standard SQL use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause.

This comes from the Mysql 5.6 manual's page on GROUP BY. If you look at the same page for 5.7.6 you see that things have changed. And changed dramatically!!

That page also gives you the solution. Disable ONLY_FULL_GROUP_BY That will make it possible for your old 5.6 query to work on 5.7.6 (remove ANY_VALUE from your query since it's not available in 5.7.6 but use the ONLY_FULL_GROUP_BY instead).

e4c5
  • 52,766
  • 11
  • 101
  • 134