3

Take the following table:

CREATE TABLE boxes (
    box integer,
    color character varying,
    size integer,
    ...
);

where both box and color can assume not unique values out of a small set.

Querying this table with:

SELECT color, box FROM boxes;

the result will be something like:

+-------+-----+
| color | box |
+-------+-----+
| blue  |   2 |
| blue  |   3 |
| blue  |   4 |
| green |   1 |
| green |   3 |
| red   |   1 |
| red   |   2 |
| red   |   2 |
+-------+-----+

Is it possible to query this table in a manner such that the result has two columns, one with an array (or string, or list) with all the different box values for each distinct color?

The result should be something like this:

+-------+-----------+
| color | box_types |
+-------+-----------+
| blue  | {2,3,4}   |
| green | {1,3}     |
| red   | {1,2}     |
+-------+-----------+

where the color column must contain unique values, and each row must contain only distinct box numbers in the aggregate column.

Given the non-agnostic character of this question, I would like to collect all the best solutions for the major DBMS. When answering, please specify for which DBMS each query works.

potashin
  • 44,205
  • 11
  • 83
  • 107
Claudio Floreani
  • 2,441
  • 28
  • 34
  • 1
    That's not really the SQL way to do things. However, some dbms products have funcionality like GROUP_CONCAT and STUFF. – jarlh Sep 18 '15 at 09:28
  • I suppose someone has to merge all the answers into one... – xpy Sep 18 '15 at 09:50
  • Yes, that's the intent. – Claudio Floreani Sep 18 '15 at 09:51
  • 1
    Since you do GROUP BY color, SELECT without DISTINCT will do fine. (Usually no need for SELECT DISTINCT when GROUP BY.) – jarlh Sep 18 '15 at 09:54
  • After some more testing I've found that simply using a function like `group_concat()` or `array_agg()` doesn't remove duplicates from the resulting aggregate column. I've edited the question to make it more clear that each resulting row must contain only different box values. – Claudio Floreani Sep 18 '15 at 13:21

3 Answers3

4

Try below.

SELECT
    color ,
     STUFF(
         (SELECT DISTINCT ',' +CONVERT(varchar(10), box)   
          FROM boxes 
          WHERE color = a.color 
          FOR XML PATH (''))
          , 1, 1, '') AS box_types
FROM boxes  AS a
GROUP BY color;

Check SQL Fiddle

Claudio Floreani
  • 2,441
  • 28
  • 34
Amnesh Goel
  • 2,617
  • 3
  • 28
  • 47
  • This is a valid answer for MS SQL Server. Please note that curly brackets were in the answer because it was an array column instead of a string. I've removed them from your answer. – Claudio Floreani Sep 18 '15 at 13:42
3

Well, in MySQL you can do the following :

select color, group_concat(box) from tbl group by color

In Oracle:

select color, wm_concat(box) from tbl group by color
potashin
  • 44,205
  • 11
  • 83
  • 107
  • 1
    It's not necessary to add curly brackets, they where in the answer for the case you group using an array column instead of one string. So `SELECT color, group_concat(box) FROM boxes GROUP BY color;` could be fine if you find the way to get only and all the distinct `box` numbers for each `color`. – Claudio Floreani Sep 18 '15 at 13:34
0

First of all, this is the negation of the principle of "normalization", in other words it's "bad".

However, there are some dbms, like Microsoft SQL Server, that implement this possibility with the clause PIVOT (and its contrary UNPIVOT).

This clause permits to create a table (using your example) like this:

+-------+------+------+------+
| color | box1 | box2 | box3 |
+-------+------+------+------+
| blue  |    2 |    3 |    4 |
| green |    1 |    3 | null |
| red   |    1 |    2 | null |
+-------+------+------+------+
  • I think that you need to know the exact -to be produced- columns in order to use `PIVOT` – xpy Sep 18 '15 at 09:56
  • Modern DBMS supports array columns. If they "negate the principle of normalization" this won't prevent users to gain benefit from their use. In the end, it's only a way to transform data so that they are more suitable for your needs. Take the case where you have data stored in one db table but you need to use them in a different way in another db table. In the example "all different box numbers for each color" is nothing but one value. – Claudio Floreani Sep 18 '15 at 09:56
  • You're right, but - as I learned - you cannot call it an RDBMS if it is not in fifth normal form. This table is not even in third !!! – Sergio Internicola Sep 22 '15 at 11:22