2

We have a Couchbase DB and a spring boot application that interacts with it. The database has many types of documents amongst which we have order transaction documents. One of the queries is to get the Count of the number of Order transaction documents from the DB. The count of these documents in the DB is around 200 million and increasing. The query looks like the following :

SELECT COUNT(META(d).id) AS count FROM `orders` AS d WHERE META().id LIKE ':order:trx:%'

We are using the following index for this :

CREATE INDEX `idx_order_trx_meta_id` ON `orders`(meta().id) WHERE meta().id LIKE ':order:trx:%'

However this query times out even after adding an index on the bucket.

Is there a way to write efficient count queries for large document sets? Like for example use a view that keeps the count of this document or some other way?

humbleCoder
  • 667
  • 14
  • 27

1 Answers1

2

The count needs computed by scanning the index. You can increase index timeout if timeout is from indexer, otherwise query timeout or both. Also try second step suggested below.

In EE try the following query and see if it can use index aggregation. If required create partition index.

SELECT COUNT(1) AS count 
FROM `orders` AS d 
WHERE META().id LIKE ":order:trx:%";

Also try the following (store 1 character per document in index), No partition index. And see if it can use fast index count (As it is using complete index, if it used it should give in few ms as one lookup from stats, MB-34624)

CREATE INDEX `ix1` ON `orders`(SUBSTR(meta().id,0,1))
WHERE meta().id LIKE ':order:trx:%'

SELECT COUNT(1) AS count 
FROM `orders` AS d 
WHERE META().id LIKE ":order:trx:%" AND SUBSTR(meta().id,0,1) IS NOT MISSING;

Also if you don't need this programmatically use index stats or UI should give you number of indexed items

vsr
  • 7,149
  • 1
  • 11
  • 10