8

I have some kind of impossible request :).

I have a table where one of the columns is named type. I would like to SELECT 3 records for each type in that column. Is that possible?

Note also that I'm using MySQL and Sphinx.

UPDATE: Table structure

id       title        type
1        AAAA         string1
2        CCCC         string2
3        EEEE         string2
4        DDDD         string2
5        FFFF         string2
6        BBBB         string2
6        BBBB         string2

What I want my MySQL to return is (up to 3 records for each type ordered by title):

id       title        type
1        AAAA         string1
6        BBBB         string2
2        CCCC         string2
4        DDDD         string2
RichardTheKiwi
  • 105,798
  • 26
  • 196
  • 262
xpepermint
  • 35,055
  • 30
  • 109
  • 163
  • 3
    First, almost nothing is impossible. Second, why not show your table structure and what you've tried so far. – ircmaxell Jan 23 '11 at 19:14

4 Answers4

13
select id, title, type
from   (select id, title, type,
               @num := if(@group = type, @num + 1, 1) as row_number,
               @group := type as dummy
        from   your_table
        order by type, title) as x
where  row_number <= 3

(Uses a different article on the same site as Martin Wickman's answer!)

Community
  • 1
  • 1
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
3

If you have an index on (type, title), and you know the possible values for type, I believe that dynamic SQL is the way to go (for once) for best performance.

For each possible value of type, add a union all and a select for that specific type. The final query will look like the following query:

(select * from t1 where type = 'string1' order by title limit 3)
  union all
(select * from t1 where type = 'string2' order by title limit 3)
  union all
(select * from t1 where type = 'string3' order by title limit 3);

It executes in less than 1 second on a table with 1,000,000 rows, whereas the others solutions (Martins & Cyberkiwis) takes roughly 11 seconds.

The difference is because the unioned query above can fetch the first three title entries for each type and then stop, whereas the simulated analytics function has to scan the entire table.

Ronnis
  • 12,593
  • 2
  • 32
  • 52
  • My database has 200k records and there are more then 500 types. What do you suggest? – xpepermint Jan 23 '11 at 23:11
  • @xpepermint, go with Martins solution. Just know that it will become slower as you add records. At some point, it will become faster to perform 500 queries in a loop. Depending on your setup this may already be the case. You have to measure yourself. – Ronnis Jan 24 '11 at 09:28
2

When the table is large and collection is more unpredictable, the row numbering needs to be ordered by type in the inner query for the side-effecting variables to work.

select id, title, type
from (select id, title, type,
        @r := CASE WHEN @g = type THEN @r+1 ELSE 1 END r,
        @g := type
      from tbl
      order by type, title) as x
where row_number <= 3
# order by type, title

Another way to do this without using side effecting variables, if no two records are exactly the same on (title, type, id), is given below. This uses only standard ANSI SQL92 SQL. It may be slower than the above though.

select A.id, A.title, A.type
from tbl A
left join tbl B on
   A.title = B.title and
   (A.type < B.type or
   (A.type = B.type and A.id < A.id))
group by A.id, A.title, A.type
having count(B.title) <= 2
RichardTheKiwi
  • 105,798
  • 26
  • 196
  • 262
0

Check out this article. Given:

+--------+------------+-------+
| type   | variety    | price |
+--------+------------+-------+
| apple  | gala       |  2.79 | 
| apple  | fuji       |  0.24 | 
| apple  | limbertwig |  2.87 | 
| orange | valencia   |  3.59 | 
| orange | navel      |  9.36 | 
| pear   | bradford   |  6.05 | 
| pear   | bartlett   |  2.14 | 
| cherry | bing       |  2.55 | 
| cherry | chelan     |  6.33 | 
+--------+------------+-------+

Query:

select type, variety, price
from fruits
where (
   select count(*) from fruits as f
   where f.type = fruits.type and f.price < fruits.price
) <= 2;
Martin Wickman
  • 19,662
  • 12
  • 82
  • 106
  • This query will show ties, so if 4 fruits have the same price, they all show (instead of 3). There has also been no attempt to link to the question. – RichardTheKiwi Jan 23 '11 at 19:34
  • If six varieties of apples shared the same, lowest price, all six would be returned. But the OP's example I'm thinking that id is probably a unique identifier (if, that is, the two "6"s are produced as the result of a JOIN) and you could use the id column instead of the price column to get a guaranteed correct result. – Larry Lustig Jan 23 '11 at 19:35