0

I am trying to optimize a join query which is taking more time when the offset increases. Your suggestions will be helpful.

SELECT  *
FROM    document_1 AS doc_1
JOIN    document_1 AS doc_2 
ON      doc_1.id_col_11 = doc_2.id_col_21
AND     doc_1.id_col_12 = doc_2.id_col_22
WHERE   doc_1.id_col_13 = 'ABC'
AND     doc_1.id_col_14= 'XYZ'
ORDER BY doc_1.created_date, document_1.id
OFFSET 0
LIMIT 20

I have created indexes as below:

CREATE idx_1 ON document_1 (id_col_11, id_col_13, id_col_14);
CREATE idx_2 ON document_1 (id_col_21, id_col_22);

enter image description here

OFFSET LIMIT TIME
0      100    5.9s 
10     100    6.0s 
1000   100    5.9s 
10     1000  16.5s 
100    500   11.4s

When the offset increased, the performance dropped. I am using Spring data to connect to a Couchbase database.

rgettman
  • 176,041
  • 30
  • 275
  • 357
stk
  • 1
  • 1

1 Answers1

0

The numbers you mentioned doesn't reflect anything in drop if offset increase. What it shows is drop in when LIMIT increases. That is expected because number of documents and size increases and those need to transfer to client and client need to consume.

Also you need where clause filters in FIRST if possible so that those can be pushed to indexer eliminated early. https://www.couchbase.com/blog/ansi-join-support-n1ql/

Once you have ORDER BY it needs to produce all possible results (last item in the JOIN can be first in the order).

When OFFSET+LIMIT < 8192 it uses heap sort and discard entries early vs full blown sort

Impact of OFFSET/LIMIT on performance discussed here https://www.couchbase.com/blog/offset-keyset-pagination-n1ql-query-couchbase/ . This will not applicable because of JOIN

See if you can use left side as subquery with Couchbase sorting records with Keyset pagination - missing records and get it work

vsr
  • 7,149
  • 1
  • 11
  • 10