1

Product table:

id: int
name: varchar(255)
info: varchar(8000)

I need to get a list of all products, grouped by name, and if some products have the same name then the one with the most data in info field should be picked to show. Another requirement is consistency - the same products should be guaranteed to be displayed every time. But performance isn't my concern at all with this.

So the best I have come up with so far is this:

select * from (select * from product order by length(info) desc) as product group by name

My understanding that this should work in MySQL but does not guarantee the results.

Is there any better way?

UPDATE:

An example of what I am looking for. If the data is

1, 'Product1', '12345'
2, 'Product1', '123'
3, 'Product2', '123456'
4, 'Product2', '123456'
5, 'Product2', '12'

The expected results should be either:

1, 'Product1', '12345'
3, 'Product2', '123456'

or

1, 'Product1', '12345'
4, 'Product2', '123456'

I don't care which one of the two as long as it's the same result every time I run the query.

Leo
  • 1,016
  • 1
  • 13
  • 32
  • What do you mean by consistency? Don't you want to include everything in the product table? – Gordon Linoff Jul 25 '12 at 14:41
  • Consistency as in getting the same result every time you run the query. With my example MySQL doesn't guarantee which product will be picked up with group by, meaning the result can potentially change every time you execute the query. – Leo Jul 25 '12 at 15:08
  • You say which "product", but the query is returning all products. The question is which "info". What consistency do you need with that, once you have gotten the longest one? – Gordon Linoff Jul 25 '12 at 15:36
  • I mean returning the same id basically. The query doesn't return all products - if two products have the same name only one of them will appear in the results. I'll update the question with the expected output – Leo Jul 25 '12 at 15:55

1 Answers1

0

I think you need to do a join:

select p.*
from product p join
     (select p.name, max(len(info)) as maxinfolen
      from product p
      group by p.name
     ) t
     on p.name = t.name and
        len(p.info) = t.maxinfolen

This will return duplicates, if the maximum length is repeated in info. To eliminate duplicates, you can group by on the outside:

select p.name, max(info)
from product p join
     (select p.name, max(len(info)) as maxinfolen
      from product p
      group by p.name
     ) t
     on p.name = t.name and
        len(p.info) = t.maxinfolen
group by p.name

If you are trying to choose the same id every time, you can use the same idea:

select p.*
from product p join
     (select p.name, min(id) as minid
      from product p join
           (select p.name, max(len(info)) as maxinfolen
            from product p
            group by p.name
           ) t
           on p.name = t.name and
              len(p.info) = t.maxinfolen
      group by p.name
     ) t
     on p.name = t.name and
        p.id = t.minid
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • still need to guarantee the consistency in such cases: the same product should always be chosen in the group by. so if a new product with same length is added, the first one picked already would be returned and not this new one. –  Jul 25 '12 at 14:12
  • and i don't see a way to do that without the help of another table or means to register previous choices. as otherwise the query above answers the requirements. –  Jul 25 '12 at 14:21
  • that doesn't really guarantee the consistency. If two products have the same name and the same len(info), MySQL doesn't guarantee which one will be picked – Leo Jul 25 '12 at 15:11
  • @Leo, The second version of the query returns the maximum value of info for the maximum length. This is consistent, unless a new value is inserted. – Gordon Linoff Jul 25 '12 at 15:39
  • Hmm, I'm not really sure if this does guarantee the consistent result. I've update my question with an example, are you sure that the same 'Product2' will be picked up every time? – Leo Jul 25 '12 at 16:03
  • @Leo . . . the original question was quite ambiguous, since you wanted a consistent info, not id. The third modification does this. – Gordon Linoff Jul 25 '12 at 17:46
  • Thanks Gordon, thats what I was looking for! Sorry about not being clear enough – Leo Jul 26 '12 at 06:32