
We have about 80M products in a Couchbase bucket, and every day we need to read all of them and run some calculations on them. I am using SELECT queries with a limit of 512 per query, inside a for loop, to read the data from each collection, but sometimes I get a timeout error. Is there a better, more efficient way to read data from Couchbase?

    import math

    # Page through the collection, buffer_size (512) documents per query.
    for x_range in range(math.ceil(total_couchbase / buffer_size)):
        offset = x_range * buffer_size
        query = get_select_query(cluster_name, scope, collection_name, buffer_size, offset)
        result = couchbase_cluster.query(query)
        for current_document in result:
            pass  # do some calculations
Matthew Groves
Obtice
  • It depends on your calculations, but maybe look into the Analytics service - https://www.couchbase.com/products/analytics (which is great for complex queries that you don't need much concurrency on) and/or Spark connector - https://docs.couchbase.com/spark-connector/current/index.html (Spark is an Apache project for data engineering/data science stuff) – Matthew Groves Mar 29 '22 at 14:27
  • Couchbase itself cannot handle too many concurrent read/write requests. For example, I have 80 collections in one bucket, and I want to read each collection from its own server and run all 80 servers together, but after a few minutes the Couchbase servers go out of service. What is the best way to do this? – Obtice Mar 29 '22 at 14:39
  • also, you might want to mention which version of Couchbase you're using – Matthew Groves Mar 29 '22 at 19:20
  • version 7.0.2 enterprise edition – Obtice Mar 29 '22 at 20:33
  • If you are just doing a calculation on individual documents, a solution based on Eventing would be very fast because it is a DCP client. As an example, look at the blog https://blog.couchbase.com/how-to-use-couchbase-xml-database/ where the "calculation" is "did the XML change?" and the action is to update a JSON equivalent. – Jon Strabala Apr 01 '22 at 14:34
  • We stream document changes with Eventing and cURL POST them to our business services, which compute and then write the results back to Couchbase – Siraf Aug 05 '22 at 14:19
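
One thing that may be worth trying alongside the suggestions in the comments: the loop in the question pages with LIMIT/OFFSET, and a large OFFSET generally forces the query service to walk past all the skipped entries, so later pages tend to get slower. A possible alternative is keyset pagination, where each page resumes after the last document key seen. The sketch below is only an illustration under assumptions: the keyspace name in the statement is a placeholder, it assumes an index on the document key exists (for example a primary index), and `iter_documents`, `run_query` and `make_fake_query` are hypothetical names, not part of any Couchbase API. The query is injected as a callable so the loop can be exercised without a cluster.

```python
from typing import Any, Callable, Dict, Iterator, List

# Hypothetical statement: `bucket`.`scope`.`collection` is a placeholder, and
# ORDER BY META(d).id assumes an index on the document key (e.g. a primary index).
KEYSET_QUERY = (
    "SELECT META(d).id AS id, d AS doc "
    "FROM `bucket`.`scope`.`collection` AS d "
    "WHERE META(d).id > $last_id "
    "ORDER BY META(d).id "
    "LIMIT $page_size"
)

# A callable that runs a statement with named parameters and returns the rows.
QueryFn = Callable[[str, Dict[str, Any]], List[Dict[str, Any]]]


def iter_documents(run_query: QueryFn, page_size: int = 512) -> Iterator[Dict[str, Any]]:
    """Yield every row, one page at a time, resuming after the last key seen."""
    last_id = ""  # every non-empty document key sorts after the empty string
    while True:
        rows = run_query(KEYSET_QUERY, {"last_id": last_id, "page_size": page_size})
        if not rows:
            return
        yield from rows
        last_id = rows[-1]["id"]
        if len(rows) < page_size:
            return  # short page: nothing left to read


def make_fake_query(keys: List[str]) -> QueryFn:
    """In-memory stand-in for the database, used only to exercise the loop."""
    ordered = sorted(keys)

    def run(statement: str, params: Dict[str, Any]) -> List[Dict[str, Any]]:
        matching = [k for k in ordered if k > params["last_id"]]
        return [{"id": k, "doc": {"key": k}} for k in matching[: params["page_size"]]]

    return run


if __name__ == "__main__":
    fake = make_fake_query([f"product::{i:04d}" for i in range(1300)])
    print(sum(1 for _ in iter_documents(fake, page_size=512)))
```

With the Python SDK, `run_query` could presumably be a thin wrapper that passes the statement and the dictionary as named parameters to `couchbase_cluster.query(...)` and collects the rows; raising the per-query timeout in the query options might also help with the timeout errors, though neither is verified against the setup in the question.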

0 Answers