GROUP BY and DISTINCT ON working differently for tables and views

Question

I have a table Design and a view on that table called ArchivedDesign. The view is declared as:

CREATE OR REPLACE VIEW public."ArchivedDesign" ("RootId",  "Id", "Created", "CreatedBy", "Modified", "ModifiedBy", "VersionStatusId", "OrganizationId")
AS
SELECT DISTINCT ON (des."RootId") "RootId", des."Id", des."Created", des."CreatedBy", des."Modified", des."ModifiedBy", des."VersionStatusId", des."OrganizationId"
FROM public."Design" AS des
JOIN public."VersionStatus" AS vt ON des."VersionStatusId" = vt."Id"
WHERE vt."Code" = 'Archived'
ORDER BY "RootId", des."Modified" DESC;

Then, I have a large query which gets a short summary of latest changes, thumbnails, etc. The whole query is not important, but it contains two almost identical subqueries - one for the main table and and one for the view.

SELECT DISTINCT ON (1) x."Id",
                TRIM(con."Name") AS "Contributor",
                extract(epoch from x."Modified") * 1000 AS "Modified",
                x."VersionStatusId",
                x."OrganizationId"
FROM public."Design" AS x
JOIN "Contributor" AS con ON con."DesignId" = x."Id"
WHERE x."OrganizationId" = ANY (ARRAY[]::uuid[])
AND x."VersionStatusId" = ANY (ARRAY[]::uuid[])
GROUP BY x."Id", con."Name"
ORDER BY x."Id";

and

SELECT DISTINCT ON (1) x."Id",
                TRIM(con."Name") AS "Contributor",
                extract(epoch from x."Modified") * 1000 AS "Modified",
                x."VersionStatusId",
                x."OrganizationId"
FROM public."ArchivedDesign" AS x
JOIN "Contributor" AS con ON con."DesignId" = x."Id"
WHERE x."OrganizationId" = ANY (ARRAY[]::uuid[])
AND x."VersionStatusId" = ANY (ARRAY[]::uuid[])
GROUP BY x."Id", con."Name"
ORDER BY x."Id";

Link to SQL fiddle: http://sqlfiddle.com/#!17/d1d0f/1

The query is valid for the table, but fails for the view with an error column x."Modified" must appear in the GROUP BY clause or be used in an aggregate function. I don't understand why there is a difference in the behavior of those two queries? How do I fix the view query to work the same way as the table query?

My ultimate goal is to replace all table sub-queries with view sub-queries so we can easily separate draft, active and archived designs.

is it possible to create a reproducible example? I'd suggest you remove all non-important information and try to get minimal viable example. The best would be to share such an example on dbfiddle.uk or sqlfiddle.com — Roman Pekar, Jun 03 '20 at 08:23
Unrelated to your problem, but: you should really avoid those dreaded quoted identifiers. They are much more trouble than they are worth it. https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_upper_case_table_or_column_names — , Jun 03 '20 at 08:26
@RomanPekar - I have added the link http://sqlfiddle.com/#!17/d1d0f/1 — Storm, Jun 03 '20 at 08:52
@a_horse_with_no_name, please address your comments to the NPGSQL team (https://www.npgsql.org/). The database schema is generated from their Entity Framework extension. — Storm, Jun 03 '20 at 08:54

score 2 · Accepted Answer · answered Jun 03 '20 at 09:03

You get that error because when you query the table directly, Postgres is able to identify the primary key of the table and knows that grouping by it is enough.

Quote from the manual

When GROUP BY is present, or any aggregate functions are present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column

^{(emphasis mine)}

When querying the view, Postgres isn't able to detect that functional dependency that makes it possible to have a "shortened" GROUP BY when querying the table directly.

good answer, I was just figuring it out - here's a link for OP to test it - http://sqlfiddle.com/#!17/d1d0f/10. IF you remove the line which adds primary key it doesn't work, with primary key it does. — Roman Pekar, Jun 03 '20 at 09:06
But the queries will be functionally the same (i.e. returning the same results) when I add the remaining columns to the `GROUP BY` statement, right? — Storm, Jun 03 '20 at 10:07

GROUP BY and DISTINCT ON working differently for tables and views

1 Answers1