
Suppose I have a table like this:

  link_ids  |  length
------------+-----------
 {1,4}      | {1,2}
 {2,5}      | {0,1}

How can I find the min length for each link_ids?

So the final output looks something like:

  link_ids  |  length
------------+-----------
 {1,4}      | 1
 {2,5}      | 0
  • Why are your columns arrays? That looks like a really bad schema design. – Falmarri Feb 10 '15 at 00:05
  • To enlarge on @Falmarri's comment: there is a set of formal rules (called normal forms) which describe how the schema of a relational database should look in order to prevent lots of problems. It's considered sensible to conform to at least the first three, and your schema doesn't conform to the first one, since the cells of your table don't store atomic values. You should use tables, not arrays, to store lists. – lared Feb 10 '15 at 00:08
  • I'm familiar with normal forms. This is the result of a recursive query that I'd like to process further. – jaynp Feb 10 '15 at 00:14
  • You should write your recursive query so that it gives you rows instead of arrays. – Falmarri Feb 10 '15 at 00:17
  • Your version of Postgres is essential to this question. Also: can columns be NULL? Can the array be empty? Is `link_ids` unique? Are arrays in ascending order like your example suggests? If you are working with an actual table, post the table definition. Else, it would be better to post your recursive query: there might be a better solution to begin with. (Plus table definitions for underlying tables.) – Erwin Brandstetter Feb 10 '15 at 00:37
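
As the comments suggest, the usual (first-normal-form) alternative is to have the recursive query return rows rather than arrays. A minimal sketch of that shape, with hypothetical table and column names (grp standing in for whatever identifies one row of the original, array-based result):

-- hypothetical rows-instead-of-arrays shape of the result
CREATE TABLE link_length (
  grp     int   -- identifies one group, i.e. one row of the array version
, link_id int
, length  int
);

-- per-group aggregates are then plain SQL
SELECT grp, array_agg(link_id) AS link_ids, min(length) AS min_length
FROM   link_length
GROUP  BY grp;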

5 Answers


Assuming a table like:

CREATE TABLE tbl (
  link_ids int[] PRIMARY KEY     -- which is odd for a PK
, length int[]
, CHECK (length <> '{}'::int[] IS TRUE)  -- rules out null and empty in length
);

Query for Postgres 9.3 or later:

SELECT link_ids, min(len) AS min_length
FROM   tbl t, unnest(t.length) len  -- implicit LATERAL join
GROUP  BY 1;
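
The same query with the LATERAL keyword spelled out - purely a syntactic variant of the implicit form above:

SELECT t.link_ids, min(len) AS min_length
FROM   tbl t
CROSS  JOIN LATERAL unnest(t.length) AS len
GROUP  BY t.link_ids;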

Or create a tiny function (Postgres 8.4+):

CREATE OR REPLACE FUNCTION arr_min(anyarray)
  RETURNS anyelement LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'SELECT min(i) FROM unnest($1) i';

Only add PARALLEL SAFE in Postgres 9.6 or later. Then:

SELECT link_ids, arr_min(length) AS min_length FROM tbl;

The function can be inlined and is fast.
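
With the sample data from the question, either variant should return:

 link_ids | min_length
----------+------------
 {1,4}    |          1
 {2,5}    |          0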

Or, for integer arrays of trivial length, use the additional module intarray and its built-in sort() function (Postgres 8.3+):

SELECT link_ids, (sort(length))[1] AS min_length FROM tbl;
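
Note that intarray ships with Postgres but is not installed by default; assuming Postgres 9.1 or later (and sufficient privileges), enable it once per database with:

CREATE EXTENSION IF NOT EXISTS intarray;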
Erwin Brandstetter

Assuming that the table name is t and that each value of link_ids is unique:

select link_ids, min(len)
from (select link_ids, unnest(length) as len from t) as t
group by link_ids;

 link_ids | min
----------+-----
 {2,5}    |   0
 {1,4}    |   1
ntalbs

A small addition to Erwin's answer - sometimes a subquery with unnest() can be even cheaper than a LATERAL join.

I used the table definition from Erwin's answer and filled it with sample data:

t=# insert into t select '{1}'::int[]||g,'{1}'::int[]||g from generate_series(1,9999,1) g;
INSERT 0 9999
t=# select * from t order by ctid desc limit 1;
 link_ids |  length
----------+----------
 {1,9999} | {1,9999}
(1 row)

Then EXPLAIN ANALYZE the LATERAL join:

t=# explain analyze select link_ids,max(r) from t, unnest(length) r where link_ids = '{1,9999}' group by 1;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.29..4.81 rows=1 width=33) (actual time=0.030..0.030 rows=1 loops=1)
   ->  Nested Loop  (cost=0.29..4.30 rows=100 width=33) (actual time=0.025..0.027 rows=2 loops=1)
         ->  Index Scan using t_pkey on t  (cost=0.29..2.30 rows=1 width=58) (actual time=0.015..0.016 rows=1 loops=1)
               Index Cond: (link_ids = '{1,9999}'::integer[])
         ->  Function Scan on unnest r  (cost=0.00..1.00 rows=100 width=4) (actual time=0.007..0.007 rows=2 loops=1)
 Total runtime: 0.059 ms
(6 rows)

And try the subquery:

t=# explain analyze select link_ids, (select max(r) from unnest(length) r) from t where link_ids = '{1,9999}';
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Index Scan using t_pkey on t  (cost=0.29..3.56 rows=1 width=58) (actual time=0.030..0.031 rows=1 loops=1)
   Index Cond: (link_ids = '{1,9999}'::integer[])
   SubPlan 1
     ->  Aggregate  (cost=1.25..1.26 rows=1 width=4) (actual time=0.011..0.011 rows=1 loops=1)
           ->  Function Scan on unnest r  (cost=0.00..1.00 rows=100 width=4) (actual time=0.008..0.008 rows=2 loops=1)
 Total runtime: 0.060 ms
(6 rows)

And lastly, make sure the result is the same:

t=# select link_ids, (select max(r) from unnest(length) r) 
from t 
where link_ids = '{1,9999}';
 link_ids | max
----------+------
 {1,9999} | 9999
(1 row)

t=# select link_ids,max(r) 
from t, unnest(length) r 
where link_ids = '{1,9999}' 
group by 1;
 link_ids | max
----------+------
 {1,9999} | 9999
(1 row)
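
Applied to the original question (minimum per row, without the WHERE filter), the scalar-subquery form would look like this - a sketch along the same lines as the queries above:

SELECT link_ids, (SELECT min(r) FROM unnest(length) r) AS min_length
FROM   t;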
Vao Tsun

For the min of an array:

SELECT min(x) from unnest(array_name) as x;

Replace min with max to get the max.
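
For example, against a literal array:

SELECT min(x) FROM unnest(ARRAY[3,1,2]) AS x;  -- returns 1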

aabiro

(I'm going to assume link_ids can have duplicates, and since there is no id column, we're going to improvise.)

WITH r AS
(SELECT row_number() OVER() as id,
       link_ids,
       length from Table1)
SELECT DISTINCT ON (id) link_ids,
       unnest(length) AS length  -- alias so that ORDER BY length sorts by the unnested values
FROM r 
ORDER BY id, length;
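
With the sample data from the question, this should produce the requested result:

 link_ids | length
----------+--------
 {1,4}    |      1
 {2,5}    |      0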

fiddle

Jakub Kania