3

Is there an efficient way to query for array_agg on sparse data as follows? Rows with sparse elements need to get aggregated even if some are overlapped only partially.

Input

 ID  | tags
 ----|-------------
 a1  | {b1, b2}
 a2  | {b1, b3, b4}
 a3  | {b4, b5}
 a4  | {b6, b7}
 a5  | {b6, b8}

As you see, tags for a1, a2, and a3 do not overlap completely, but they should be aggregated for partial overlap. Since the data are sparse, the size of aggregate will not be very big.

Expected output

 IDs          | tags
 -------------|-------------
 {a1, a2, a3} | {b1, b2, b3, b4, b5}
 {a4, a5}     | {b6, b7, b8}

It would be nice if the query gets done over night or so for some 200 thousand rows of data.

Thank you for help.

Toshi

SG. Nihonbashi
  • 471
  • 1
  • 4
  • 7

0 Answers0