Does Couchbase store documents in memory first before moving the data to the file store? Is there any configuration available to specify how long the data has to be stored in memory before it can be flushed to the file store?
2 Answers
Couchbase's architecture is memory-first (cache-through). You can't choose whether memory is used or not, and it writes the data to disk as soon as possible. Part of that is that you need to have enough memory for the amount of data you have.
You do have some eviction policies, such as Full or Value eviction, but again you don't have control over when data is written to disk.
What you can do in the SDK is wait until the data has been replicated or persisted to disk.
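
To make the "memory first, disk as soon as possible" behaviour and the optional wait-for-persistence a bit more concrete, here is a toy Python model. It is only a sketch and not Couchbase code: it does not use the Couchbase SDK, and `ToyBucket`, `upsert`, `persist`, and `flush_delay` are names made up for this illustration.

```python
import queue
import threading
import time


class ToyBucket:
    """Toy memory-first store: a write lands in the cache, then a
    background thread drains a write queue to 'disk'."""

    def __init__(self, flush_delay=0.01):
        self.cache = {}               # stands in for the managed cache
        self.disk = {}                # stands in for the on-disk store
        self._queue = queue.Queue()   # stands in for the disk write queue
        self._flush_delay = flush_delay
        threading.Thread(target=self._flusher, daemon=True).start()

    def _flusher(self):
        # Drain the queue in order, as fast as the (simulated) disk allows.
        while True:
            key, value, done = self._queue.get()
            time.sleep(self._flush_delay)   # pretend disk I/O takes a moment
            self.disk[key] = value
            done.set()

    def upsert(self, key, value, persist=False, timeout=1.0):
        """Store in the cache and queue for disk. With persist=True,
        block until the flusher has written this key."""
        self.cache[key] = value
        done = threading.Event()
        self._queue.put((key, value, done))
        if persist and not done.wait(timeout):
            raise TimeoutError(f"{key!r} was not persisted within {timeout}s")
        return done
```

Calling `bucket.upsert("a", "1")` returns as soon as the value is in the cache, while `bucket.upsert("b", "2", persist=True)` only returns once the value is also in `bucket.disk`. The caller never chooses how long data stays in memory; it can only choose how long to wait before treating the write as done.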

-
Thanks Roi. For instance, if I have 100 GB of memory, is the data retained in memory until the 100 GB limit is reached before it is flushed out to disk? – Punter Vicky Mar 07 '16 at 17:30
-
Not exactly. There is what is called the high water mark (let's say 80% of the memory allocated to the bucket). After you reach it, depending on the eviction policy, either the full document (metadata + data) is evicted or just the data. – Roi Katz Mar 08 '16 at 06:29
-
You can also tune the bucket to only take part of the 100GB of RAM. So let's say you have 100GB of data, but only really need 50GB of it in RAM because of the DB usage patterns. You can tune the cache for that. The other thing is that Couchbase will try to keep replica data in the cache too, so if there is a failover event, that data is ready to go. If Couchbase comes under memory pressure, it will eject the replica data first though. Being able to keep more data on disk than you have in the cache is nice too. – NoSQLKnowHow Mar 08 '16 at 16:06
-
Thanks Kirk! Does the data get evicted to disk only if there is memory pressure, or is it stored on disk immediately and also retained in RAM? – Punter Vicky Mar 08 '16 at 16:44
-
@PunterVicky Data is only ejected from the cache if there is memory pressure, but the data is always on disk as well. So if there is a need for that data, Couchbase will go to disk, place it in the managed cache and then service the request. – NoSQLKnowHow Mar 09 '16 at 18:25
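
The comments above can be summarised in a second toy Python model: every document is on disk, values are ejected from the cache only once usage passes a high water mark, and a read of an ejected document goes to disk and puts it back in the cache. This is a sketch under stated assumptions, not how Couchbase is implemented: `ToyCache` and all of its names are invented, the low water mark of 60 is an arbitrary choice, ejection is oldest-access-first for simplicity, metadata size is not counted, and replica data is not modelled.

```python
from collections import OrderedDict


class ToyCache:
    """Toy model of water-mark ejection with value vs. full eviction."""

    def __init__(self, high_mark=80, low_mark=60, full_eviction=False):
        self.high_mark = high_mark      # start ejecting above this many bytes
        self.low_mark = low_mark        # stop ejecting at or below this many bytes
        self.full_eviction = full_eviction
        self.disk = {}                  # every document is always here
        self.values = OrderedDict()     # resident values, oldest access first
        self.meta = set()               # keys whose metadata is resident

    def used(self):
        return sum(len(v) for v in self.values.values())

    def set(self, key, value):
        self.disk[key] = value          # toy shortcut: persist immediately
        self.values[key] = value
        self.values.move_to_end(key)
        self.meta.add(key)
        self._eject_if_needed()

    def get(self, key):
        if key not in self.values:      # cache miss: fetch from disk into the cache
            self.values[key] = self.disk[key]
            self.meta.add(key)
        self.values.move_to_end(key)
        value = self.values[key]
        self._eject_if_needed()
        return value

    def _eject_if_needed(self):
        if self.used() <= self.high_mark:
            return                      # no memory pressure, nothing is ejected
        while self.used() > self.low_mark and len(self.values) > 1:
            key, _ = self.values.popitem(last=False)
            if self.full_eviction:      # full eviction drops the metadata too
                self.meta.discard(key)
```

With value eviction, an ejected key stays in `meta` while its value lives only in `disk`; with `full_eviction=True` both are dropped from memory. In either case `get` still returns the document, it just pays for a disk read first.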
Couchbase stores data both on disk and in RAM. The default behavior is to write the document to disk at some arbitrary time (usually quickly) after storing it in RAM. This leaves a short window where node failure can result in loss of data. I can't find anything in the documentation for the current version of Couchbase, but it used to be that you could request that the "set" method only complete once the data has been persisted to disk (the default is RAM only).
In any case, after writing to RAM, the document will eventually be written to disk. Couchbase keeps a disk write queue, which you can check on the metrics report page in the management console. Now, CB does synchronize writes across the cluster, and I believe a write will be synchronized across the cluster before Couchbase acknowledges that the write happened (e.g. before the write method returns to the caller). Again, this is hard to determine from the documentation, as the documentation for prior versions was much more detailed.
If you have more documents than available RAM, only the most-frequently accessed documents will be stored in RAM for quick retrieval, with all others being "evicted" to disk.

-
Couchbase does not write to disk "at some arbitrary time". It writes to disk as fast as resources will allow. Yes, there are some OS-level queues for disk, but that is unavoidable. Also, to clarify, Couchbase does not just write data to RAM, but to a managed cache. Specifically, it uses hybrid memcached for this. There is more to it than this, but you get the idea. – NoSQLKnowHow Mar 08 '16 at 16:01
-
@Kirk - thank you for the clarification - to be clear, the word *arbitrary* means in the mathematical sense "non-deterministic or unspecified" as opposed to without logic. And yes, while it uses memcached, the gist is that the data is stored in RAM as opposed to on slower, physical media. – theMayer Mar 08 '16 at 19:23