Column must appear in the GROUP BY clause or be used in an aggregate function (PostgreSQL)

Question

I have this database of bank information:

  id    | date     |     asset           
--------+----------+---------------
 1      | 6/30/2001|    333860 
 1      | 3/31/2001|    336896
 1      | 9/30/2001|    349343
 2      | 6/30/2001|    451297
 2      | 3/31/2001|    411421
 2      | 9/30/2001|    430178
 3      | 6/30/2001|    106506
 3      | 3/31/2001|    104196
 3      | 9/30/2001|    106383

I am trying to display the ID for the bank with the second largest total asset. Here is the code I used (similar to this question)

SELECT DISTINCT(id), SUM(asset) AS mv 
FROM bank2001 
WHERE asset NOT IN (SELECT MAX(asset) FROM bank2001)

When I ran the query, I received the following error:

ERROR: column "bank2001.id" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: SELECT DISTINCT(id), MAX(asset) AS mx

I have looked up this error but I can't find anything on how it relates to this problem since I do not have GROUP BY in my code. What does this error mean and how can I go about fixing this issue?

Note: The code should have returned the id of 1.

*I do not have GROUP BY in my code.*: that's what the error is complaining about. You must have one, and the ID must be listed in this group by clause. — JB Nizet, Nov 06 '18 at 00:46
@dwir182 I forgot to add the parentheses when I copied my code here. Thanks for pointing that out. — Ian97, Nov 06 '18 at 00:48

dwir182 · Answer 1 · 2018-11-06T01:11:17.647

0

First Distinct are not Aggregate Function.. It will remove duplicates but not do Aggregate. And you have Sum() function which is Aggregate so it need to Group By

SELECT 
    DISTINCT(id), SUM(asset) AS mv 
FROM 
    bank2001 
WHERE 
    asset NOT IN (SELECT 
                     MAX(asset) 
                  FROM 
                     bank2001)
GROUP BY
     id

Here the result : Fiddle

EDIT :

SELECT 
    id, SUM(asset) AS mv 
FROM 
    bank2001 
WHERE 
    asset NOT IN (SELECT 
                     MAX(asset) 
                  FROM 
                     bank2001)
GROUP BY
     id

As in comment.. Group will show per ID so Distinct you don't need it..

You can see the same result : Fiddle

edited Nov 06 '18 at 01:11

answered Nov 06 '18 at 00:50

dwir182

1,539
10
20

You can remove the distinct, because the group by will only return one row per ID anyway. – DancingFool Nov 06 '18 at 01:07
Agree.. But i think the op problem about `distinct`.. But i will edit my answer.. Thanks.. :) – dwir182 Nov 06 '18 at 01:08
Thanks for the explanation @dwir182 and @DancingFool. It really helped me to understand the error. One follow up question. If I wanted to show only the ID for the second highest bank, is it possible to put an `ORDER BY` statement followed up `desc offset 1 limit 1` somewhere in the code? – Ian97 Nov 06 '18 at 01:33

score 0 · Answer 2 · answered Nov 06 '18 at 01:31

Apart from the Group by issue you're having, this query is not going to do what you want anyway, because this bit

WHERE asset NOT IN (SELECT MAX(asset) FROM bank2001)

is just removing the fourth line from your database. You need to treat this as two separate tasks - first find the total amount each bank has, then find the second largest one of those sums. Something like this (there are other ways of doing it, but this is closest to your original query)

--This part gets the asset totals for each bank - returns one row per bank
WITH BankAssets as (
    Select id, SUM(asset) AS assetTotal
    FROM bank2001
    GROUP BY id
)
-- This part returns the top bank that does not have the maximum asset sum
Select id
From BankAssets
WHERE assetTotal NOT IN (SELECT MAX(assetTotal) FROM BankAssets) --skip any banks with the top amount
ORDER BY assetTotal DESC  --order in descending order
LIMIT 1  --only return one row

Note that if the top two banks by asset total had the same amount, this would get the third one.

score 0 · Accepted Answer · answered Nov 06 '18 at 01:32

A table and some rows to start. (You should include this in your question.)

create table bank2001 (
  id integer,
  date date,
  asset integer,
  primary key (id, date)
  );

  insert into bank2001 values
 (1, '6/30/2001',    333860), 
 (1, '3/31/2001',    336896),
 (1, '9/30/2001',    349343),
 (2, '6/30/2001',    451297),
 (2, '3/31/2001',    411421),
 (2, '9/30/2001',    430178),
 (3, '6/30/2001',    106506),
 (3, '3/31/2001',    104196),
 (3, '9/30/2001',    106383);

This query returns the total assets per id.

select id, sum(asset) as total_assets
from bank2001
group by id
order by total_assets desc

id  total_assets
2   1292896
1   1020099
3   317085

I am trying to display the ID for the bank with the second largest total asset.

It's clear that the right answer is 1. You're trying to return the id for the second row. One way is to tell PostgreSQL you want to get back one row, and you want to skip the first row to get it.

with total_assets as (
  select id, sum(asset) as total_assets
  from bank2001
  group by id
)
select id
from total_assets
order by total_assets desc
limit 1 offset 1;

Column must appear in the GROUP BY clause or be used in an aggregate function (PostgreSQL)

3 Answers3