1

Consider this example:

SELECT    comment_date
        , array_agg(user_id) users
        , array_agg(comment) comments
FROM      user_comments
GROUP BY  comment_date
  1. Is it safe to assume that the indexes of users and comments refer to the same record (e.g., users[3] created comments[3])?
  2. Is it possible that the order of the two arrays may refer to different records, possibly due to performance enhancements?

I don't know enough about the internals of Postgres to trust array_agg ordering.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
vol7ron
  • 40,809
  • 21
  • 119
  • 172

1 Answers1

0

Unless you explicitly specify it, you cannot assume anything about the order an aggregate function is applied. If you want to ensure that two calls to array_agg have corresponding values you should add the order by clause to both of them. E.g.:

SELECT    date
        , array_agg(user_id ORDER BY user_id) users
        , array_agg(comment ORDER BY user_id) comments
FROM      user_comments
GROUP BY  date
Mureinik
  • 297,002
  • 52
  • 306
  • 350
  • Do you know this for certain? The query planner doesn't see two *array_agg*s and populate both when looking at the record? – vol7ron Jul 15 '15 at 15:16
  • It may (and probably should), but nowhere in postgres' documentation is there any guarantee of this. In my book, if it's not guaranteed at that level it means it may very well be changed for some optimization, so I wouldn't write code that assumes this. – Mureinik Jul 15 '15 at 15:28