0
  • I am working with Postgres 12.7.
  • I have a table with several tens of millions of rows. Within it there are three rows of bigint[]-arrays. I combine them to one array, lets call it bigint_array, which comprises around one to ten numbers.
  • I want to compare this array to a temporary table consisting of one column with 170k rows of bigint values. Let's call that one reference_column.
  • Ideally, I want to know how complete this intersection is. Whether the intersection is one hit, more than one hit or if the whole array is represented in the numbers of the column.

I read of the operator '&' which is supposed to compute the intersection of two arrays. Converting reference_column into an array and using the operator gives me the error "SQL Error [42883]: ERROR: operator does not exist: bigint[] & bigint[]". Hm.

If I want a preliminary result and use the && operator to see if there is any overlap between bigint_array and array_agg(reference_column) it apparently works, but takes hours and does not solve my problem of categorizing the completeness of the overlap.

Do you maybe have a hint? Is there a way to compute the intersection of an array with a column without converting that to an array?

I am aware of the lack of an MWG and will provide one after my working hours if necessary. I am taking a long time for creating them as I am rather inexperienced with SQL. I am hoping for a good hint without one but will update if necessary. Excuse that and thank you.

bfalk
  • 33
  • 4

1 Answers1

0

Based on the replies from this, quoting that one, you can use CROSS JOIN LATERAL the following way.

In your table with the reference_column, you can add the array you want to perform the intercection upon as a new column, with the same value across the rows.

Assuming that:

  • your bigint_array is obtained in a result called result1 as column called bigint_array
  • your table where the reference column is taken from is nammed table2, and the data is already formatted as an array :
SELECT *
FROM (
   SELECT
     (SELECT bigint_array FROM result1) AS array1
     , reference_column AS ref_array
     FROM table2
     ) AS O
     CROSS JOIN LATERAL 
     (
     SELECT COUNT(*) AS n_inter
     FROM 
     (
     SELECT UNNEST(array1)
     INTERSECT
     SELECT UNNEST (array2)
     ) AS U
) AS I
ORDER BY n_inter DESC