SQL grouping by all the columns

Question

Is there any way to group by all the columns of a table without specifying the column names? Like:

select * from table group by *

Why would this be a meaningless question? Seems like something that any beginner sql programmer might ask. — womp, Apr 29 '09 at 04:50
This would be useful when querying a hive table where rows with null values are not counted when querying `select count(distinct *) from blah` — Jeremy, Oct 27 '21 at 18:20

Elijah · Answer 1 · 2009-04-29T05:41:42.883

64

The DISTINCT Keyword

I believe what you are trying to do is:

SELECT DISTINCT * FROM MyFooTable;

If you group by all columns, you are just requesting that duplicate data be removed.

For example a table with the following data:

 id |     value      
----+----------------
  1 | foo
  2 | bar
  1 | foo
  3 | something else

If you perform the following query which is essentially the same as SELECT * FROM MyFooTable GROUP BY * if you are assuming * means all columns:

SELECT * FROM MyFooTable GROUP BY id, value;

 id |     value      
----+----------------
  1 | foo
  3 | something else
  2 | bar

It removes all duplicate values, which essentially makes it semantically identical to using the DISTINCT keyword with the exception of the ordering of results. For example:

SELECT DISTINCT * FROM MyFooTable;

 id |     value      
----+----------------
  1 | foo
  2 | bar
  3 | something else

edited Apr 29 '09 at 05:41

answered Apr 29 '09 at 05:31

Elijah

13,368
10
57
89

i think that is not correct. if use row_number() over something, the distinct count more. if i have one x_code which have two y_code by join, the "distinct" brings me 2 rows and count two times over row_number(), but if i use group by correctily only brings me one. I'm going through it! I need a sequencial integer registry and "distinct" count me 2 times on x_code. – Natan Medeiros Apr 05 '16 at 12:29
This misses a few important cases such as UDAF's – WestCoastProjects Dec 09 '18 at 06:56
1

Using DISTINCT is not always possible directly. If a query contains the ORDER BY clause, then one might get an error: 'ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list' The only solution in this situation is a nested query AFAIK. – prograils Feb 25 '19 at 12:25

score 12 · Answer 2 · answered Apr 29 '09 at 04:58

If you are using SqlServer the distinct keyword should work for you. (Not sure about other databases)

declare @t table (a int , b int)

insert into @t (a,b) select 1, 1
insert into @t (a,b) select 1, 2
insert into @t (a,b) select 1, 1

select distinct * from @t

results in

a b
1 1
1 2

score 7 · Answer 3 · answered Aug 28 '15 at 07:12

7

I wanted to do counts and sums over full resultset. I achieved grouping by all with GROUP BY 1=1.

answered Aug 28 '15 at 07:12

Vojtěch

11,312
31
103
173

3

I don't know how this works but it's brilliant and exactly what I needed. Is this trick documented somewhere and it is supported by all major relational DBs? – kosmičák Aug 28 '20 at 10:38

score 6 · Answer 4 · answered Apr 29 '09 at 18:09

nope. are you trying to do some aggregation? if so, you could do something like this to get what you need

;with a as
(
     select sum(IntField) as Total
     from Table
     group by CharField
)
select *, a.Total
from Table t
inner join a
on t.Field=a.Field

score 5 · Answer 5 · answered Apr 29 '09 at 04:49

5

Short answer: no. GROUP BY clauses intrinsically require order to the way they arrange your results. A different order of field groupings would lead to different results.

Specifying a wildcard would leave the statement open to interpretation and unpredictable behaviour.

answered Apr 29 '09 at 04:49

womp

115,835
26
236
269

1

"open to interpretation" This is easily addressed by specifying the natural order of the columns - i.e. the order in which they are defined. A wildcard syntax would be a useful feature. – WestCoastProjects Dec 09 '18 at 12:23
1

As an implicit convention, it would *still* be unpredictable and dependant on the order specified in the `select` clause. As a feature might make sense, but am not sure it is compatible with SQL theoretical model. – Kamafeather Aug 31 '21 at 12:14

score 1 · Answer 6 · edited Mar 29 '15 at 20:23

1

No because this fundamentally means that you will not be grouping anything. If you group by all columns (and have a properly defined table w/ a unique index) then SELECT * FROM table is essentially the same thing as SELECT * FROM table GROUP BY *.

edited Mar 29 '15 at 20:23

Jeff

12,555
5
33
60

answered Apr 29 '09 at 04:46

jellomonkey

1,954
15
15

25

Of course, if you *don't* have a unique index, SELECT * FROM table is *not* the same as SELECT * FROM table GROUP BY *. In that case, you can accomplish this using SELECT DISTINCT * FROM table. – Sören Kuklau Apr 29 '09 at 05:01
8

Having duplicate rows is a common thing you have to deal when inheriting projects - so I don't think that it is safe to assume someone would never want to remove duplicate rows. – Elijah Apr 29 '09 at 05:43
6

There is another case... SELECT t1.*,count(t2.items) FROM t1 LEFT JOIN t2 ON t1.id = t2.id GROUP BY t1.* – Danny Dulai Jul 22 '16 at 22:27

score 0 · Answer 7 · answered Jul 07 '23 at 15:48

Some databases support GROUP BY ALL, such as Snowflake (see documentation here).

GROUP BY ALL

Specifies that all items in the SELECT list that do not use aggregate functions should be used for grouping.

It's great to simplify some queries.

Before:

SELECT col1, col2, col3, col4, col5, col6, col7, col8, col9, SUM(price)
FROM table
GROUP BY col1, col2, col3, col4, col5, col6, col7, col8, col9

AFTER:

SELECT col1, col2, col3, col4, col5, col6, col7, col8, col9, SUM(price)
FROM table
GROUP BY ALL

score -1 · Answer 8 · edited Aug 12 '17 at 14:46

Here is my suggestion:

DECLARE @FIELDS VARCHAR(MAX), @NUM INT

--DROP TABLE #FIELD_LIST

SET @NUM = 1
SET @FIELDS = ''

SELECT 
'SEQ' = IDENTITY(int,1,1) ,
COLUMN_NAME
INTO #FIELD_LIST
FROM Req.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = N'new340B'

WHILE @NUM <= (SELECT COUNT(*) FROM #FIELD_LIST)
BEGIN
SET @FIELDS = @FIELDS + ',' + (SELECT COLUMN_NAME FROM #FIELD_LIST WHERE SEQ = @NUM)
SET @NUM = @NUM + 1
END

SET @FIELDS = RIGHT(@FIELDS,LEN(@FIELDS)-1)

EXEC('SELECT ' + @FIELDS + ', COUNT(*) AS QTY FROM [Req].[dbo].[new340B] GROUP BY ' + @FIELDS + ' HAVING COUNT(*) > 1  ')

score -3 · Answer 9 · answered Jul 31 '13 at 12:45

-3

You can use Group by All but be careful as Group by All will be removed from future versions of SQL server.

answered Jul 31 '13 at 12:45

Ravi

310
3
11

2

´Group by All´ has nothing to do with grouping by all columns ! – iDevlop Nov 08 '13 at 10:16

SQL grouping by all the columns

9 Answers9

The DISTINCT Keyword

Linked