- How much RAM do each of these processes use by default?
By default:
- the yb-tserver process assumes 85% of the node's RAM is available for its use, and
- the yb-master process assumes 10% of the node's RAM is available for its use.
These limits come from the default values of the gflag
--default_memory_limit_to_ram_ratio
(0.85 for yb-tserver and 0.1 for yb-master).
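For instance, a minimal sketch of passing these (default) values explicitly at startup; binary paths are illustrative and the other flags each process requires are elided:

```shell
# Illustrative only: each process takes its own copy of the flag.
./bin/yb-master  --default_memory_limit_to_ram_ratio=0.1  ...
./bin/yb-tserver --default_memory_limit_to_ram_ratio=0.85 ...
```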
- Is there a way to explicitly control this?
Yes, there are two different options for controlling how much memory is allocated to the yb-master and yb-tserver processes:
Option A) You can set --default_memory_limit_to_ram_ratio to control what percentage of the node's RAM the process should use.
Option B) You can also specify an absolute value using --memory_limit_hard_bytes. For example, to give yb-tserver 32GB of RAM, use:
--memory_limit_hard_bytes 34359738368
Since you start these two processes independently, you can use either option for yb-master or yb-tserver. Just make sure you don't oversubscribe total machine memory, since a yb-master and a yb-tserver process can be colocated on a single VM.
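As a quick sanity check on the 32GB figure above, the byte value can be derived with shell arithmetic:

```shell
# 32 GiB expressed in bytes, i.e. the value passed to --memory_limit_hard_bytes.
echo $((32 * 1024 * 1024 * 1024))
```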
- Presumably, the memory is used for caching, memtables, etc. How are these components sized?
Yes, the primary consumers of memory are the block cache, memstores, and the memory needed for in-flight requests/RPCs.
Block Cache:
--db_block_cache_size_percentage=50 (default)
Total memstore is the minimum of these two knobs:
--global_memstore_size_mb_max=2048
--global_memstore_size_percentage=10
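Putting the defaults together, here is a small illustrative sketch of how these knobs could size the components for a given node. The function and variable names are my own, and it assumes the percentages apply to the process's memory limit:

```python
GIB = 1024 ** 3

def tserver_memory_breakdown(node_ram_bytes,
                             memory_limit_ratio=0.85,
                             block_cache_pct=50,
                             global_memstore_mb_max=2048,
                             global_memstore_pct=10):
    """Return (process limit, block cache, total memstore) in bytes,
    using the default flag values quoted above."""
    process_limit = int(node_ram_bytes * memory_limit_ratio)
    block_cache = int(process_limit * block_cache_pct / 100)
    # Total memstore is the minimum of the absolute cap and the percentage cap.
    memstore = min(global_memstore_mb_max * 1024 * 1024,
                   int(process_limit * global_memstore_pct / 100))
    return process_limit, block_cache, memstore

# Example: a 64 GiB node.
limit, cache, memstore = tserver_memory_breakdown(64 * GIB)
```

On this hypothetical 64 GiB node, the 2048 MB absolute cap is the smaller of the two memstore knobs, so it wins.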
- Can specific tables be pinned in memory (or, say, given higher priority in caches)?
As of 1.1, we do not yet have per-table pinning hints. However, the block cache already does a good job by default of keeping hot blocks in cache. We have enhanced RocksDB's block cache to be scan resistant: the motivation was to prevent operations such as long-running scans (e.g., due to an occasional large query or background Spark jobs) from polluting the entire cache with poor-quality data and wiping out useful/hot data.