98

I'm selecting some objects and their tags in Postgres. The schema is fairly simple, three tables:

objects id

taggings id | object_id | tag_id

tags id | tag

I'm joining the tables like this, using array_agg to aggregate the tags into one field:

SELECT objects.*,
    array_agg(tags.tag) AS tags,
FROM objects
LEFT JOIN taggings ON objects.id = taggings.object_id
LEFT JOIN tags ON tags.id = taggings.tag_id

However, if the object has no tags, Postgres returns this:

[ null ]

instead of an an empty array. How can I return an empty array when there are no tags? I have double checked that I don't have a null tag being returned.

The aggregate docs say "The coalesce function can be used to substitute zero or an empty array for null when necessary". I tried COALESCE(ARRAY_AGG(tags.tag)) as tags but it still returns an array with null. I have tried making the second parameter numerous things (such as COALESCE(ARRAY_AGG(tags.tag), ARRAY()), but they all result in syntax errors.

Andy Ray
  • 30,372
  • 14
  • 101
  • 138
  • 3
    IMHO aggregation should return empty array, not sure why they decided to return `null`. There may be a reason, but something returning an array should not return `null`. – Christophe Roussy Mar 01 '19 at 19:40

7 Answers7

87

Another option might be array_remove(..., NULL) (introduced in 9.3) if tags.tag is NOT NULL (otherwise you might want to keep NULL values in the array, but in that case, you can't distinguish between a single existing NULL tag and a NULL tag due to the LEFT JOIN):

SELECT objects.*,
     array_remove(array_agg(tags.tag), NULL) AS tags,
FROM objects
LEFT JOIN taggings ON objects.id = taggings.object_id
LEFT JOIN tags ON tags.id = taggings.tag_id

If no tags are found, an empty array is returned.

Thomas Perl
  • 2,178
  • 23
  • 20
  • 1
    I chose this answer, perhaps unfairly to the others, because it involves much less query modification, and I don't care about the null tag case here. – Andy Ray Feb 15 '16 at 05:59
  • This is the answer I used too, but for those wishing to learn more about the 'why', see Patrick's answer below along with the aggregate function documentation https://www.postgresql.org/docs/9.5/static/functions-aggregate.html – Tom Gerken Dec 28 '16 at 18:50
  • Any way to use this with jsob_agg? – pdowling Apr 24 '19 at 15:54
  • Since `NULL` tags are very likely to be invalid, I also think that this is a fine answer. – Bart Hofland Dec 13 '19 at 10:36
70

Since 9.4 one can restrict an aggregate function call to proceed only rows that match a certain criterion: array_agg(tags.tag) filter (where tags.tag is not null)

Alexey Bashtanov
  • 1,274
  • 10
  • 7
  • 2
    If all `tags.tag` are null, this returns `null`, not an empty array. Is there a way to return an empty array by default? – tim-phillips May 16 '19 at 00:09
  • 34
    `coalesce(array_agg(tags.tag) filter (where tags.tag is not null), '{}')` – Alexey Bashtanov May 17 '19 at 18:32
  • From @AlexeyBashtanov's comment, if you want to fall back to an empty array instead of an empty object, just switch `'{}'` to `'[]'`. – bsplosion Feb 15 '23 at 19:21
  • @bsplosion Postgres array literals are `'{}'` – xlm Mar 08 '23 at 23:28
  • @xlm that's odd - it doesn't work that way for me on Postgres 14 (it returns an object literal with `'{}'`). – bsplosion Mar 09 '23 at 14:27
  • 1
    @bsplosion maybe you have an extension installed. From [docs](https://www.postgresql.org/docs/14/arrays.html#ARRAYS-INPUT): _To write an array value as a literal constant, enclose the element values within curly braces and separate them by commas_ – xlm Mar 09 '23 at 21:22
22

The docs say that when you are aggregating zero rows, then you get a null value, and the note about using COALESCE is addressing this specific case.

This does not apply to your query, because of the way a LEFT JOIN behaves - when it finds zero matching rows, it returns one row, filled with nulls (and the aggregate of one null row is an array with one null element).

You might be tempted to blindly replace [NULL] with [] in the output, but then you lose the ability to distiguish between objects with no tags and tagged objects where tags.tag is null. Your application logic and/or integrity constraints may not allow this second case, but that's all the more reason not to suppress a null tag if it does manage to sneak in.

You can identify an object with no tags (or in general, tell when a LEFT JOIN found no matches) by checking whether the field on the other side of the join condition is null. So in your case, just replace

array_agg(tags.tag)

with

CASE
  WHEN taggings.object_id IS NULL
  THEN ARRAY[]::text[]
  ELSE array_agg(tags.tag)
END
Nick Barnes
  • 19,816
  • 3
  • 51
  • 63
  • 2
    I think this is a better explanation and answer, except I note that it requires `taggings.object_id` to be added to a `GROUP BY` clause in order to avoid syntax error `ERROR: 42803: column "taggings.object_id" must appear in the GROUP BY clause or be used in an aggregate function` -- does adding this clause alter the final results at all? – user9645 Jan 10 '17 at 17:22
  • 1
    @user9645: Assuming that the original query had a `GROUP BY objects.id` (which is needed to avoid this same error), changing it to `GROUP BY objects.id, taggings.object_id` won't affect the grouping (the `JOIN` condition ensures that a given `objects.id` value can never be associated with multiple distinct `taggings.object_id` values). – Nick Barnes Jan 11 '17 at 03:46
  • Nick - Thanks I thought so but was not positive. – user9645 Jan 11 '17 at 20:29
4

The documentation says that an array containing NULL is returned. If you want to convert that to an empty array, then you need to do some minor magic:

SELECT objects.id,
    CASE WHEN length((array_agg(tags.tag))[1]) > 0
    THEN array_agg(tags.tag) 
    ELSE ARRAY[]::text[] END AS tags
FROM objects
LEFT JOIN taggings ON objects.id = taggings.object_id
LEFT JOIN tags ON tags.id = taggings.tag_id
GROUP BY 1;

This assumes that the tags are of text type (or any of its variants); modify the cast as required.

The trick here is that the first (and only) element in a [NULL] array has a length of 0, so if any data is returned from tags you return the aggregate, otherwise construct an empty array of the right type.

Incidentally, the statement in the documentation about using coalesce() is a bit crummy: what is meant is that if you do not want NULL as a result, you can use coalesce() to turn that into a 0 or some other output of your choosing. But you need to apply that to the array elements instead of the array, which, in your case, would not provide a solution.

Patrick
  • 29,357
  • 6
  • 62
  • 90
  • 1
    Yep, you actually want the reverse of `coalesce`, `nullif`, if the question really is as it appears. – Craig Ringer Jun 29 '15 at 06:58
  • `length(NULL)` is `NULL`, not `0`. `length(NULL) > 0` is also `NULL`, which happens to fall through to the `ELSE` case. But so does `length('') > 0`, and I don't think this is the desired behaviour. – Nick Barnes Jun 29 '15 at 14:15
  • https://www.postgresql.org/docs/9.5/static/functions-aggregate.html It's kind of buried in the documentation but the relevant text is "It should be noted that except for count, these functions return a null value when no rows are selected. In particular, sum of no rows returns null, not zero as one might expect, and array_agg returns null rather than an empty array when there are no input rows." – Tom Gerken Dec 28 '16 at 18:48
3

What about this:

COALESCE(NULLIF(array_agg(tags.tag), '{NULL}'), '{}') AS tags,

Seems to work.

user9645
  • 6,286
  • 6
  • 29
  • 43
1

Perhaps this answer is a little late, but I wanted to share with you that another querying strategy is possible as well: performing the aggregation in a a separate (common) table expression.

WITH cte_tags AS (
  SELECT
    taggings.object_id,
    array_agg(tags.tag) AS tags
  FROM
    taggings
    INNER JOIN tags ON tags.id = taggings.tag_id
  GROUP BY
    taggings.object_id
)
SELECT
  objects.*,
  cte_tags.tags
FROM
  objects
  LEFT JOIN cte_tags ON cte_tags.object_id = objects.id

Instead of an array with a single element of NULL, you will now get NULL instead of an array.

If you really want an empty array instead of NULL in your results, you can use the COALESCE function...:

WITH cte_tags AS (
  SELECT
    taggings.object_id,
    array_agg(tags.tag) AS tags
  FROM
    taggings
    INNER JOIN tags ON tags.id = taggings.tag_id
  GROUP BY
    taggings.object_id
)
SELECT
  objects.*,
  COALESCE(cte_tags.tags, '{}') AS tags
FROM
  objects
  LEFT JOIN cte_tags ON cte_tags.object_id = objects.id

...or use array-to-array concatenation:

WITH cte_tags AS (
  SELECT
    taggings.object_id,
    array_agg(tags.tag) AS tags
  FROM
    taggings
    INNER JOIN tags ON tags.id = taggings.tag_id
  GROUP BY
    taggings.object_id
)
SELECT
  objects.*,
  cte_tags.tags || '{}' AS tags
FROM
  objects
  LEFT JOIN cte_tags ON cte_tags.object_id = objects.id
Bart Hofland
  • 3,700
  • 1
  • 13
  • 22
-2

I replaced

array_to_json(array_agg(col_name))

with

array_to_json(coalesce(array_agg(col_name), ARRAY[]::record[]))

so that instead of returning a null JSON value I got an empty JSON array

johnleuner
  • 27
  • 2