Your observation is correct: ScyllaDB's CDC (which Alternator Streams is based on) divides the change data into a very large number of "shards" (confusingly, in ScyllaDB terminology the word "shard" means something different, so these are called "streams" instead).
The reason for this design decision is that whereas DynamoDB chose to keep the number of shards small to make reading the change log easier and more efficient, ScyllaDB chose to make writing it more efficient. This matters because writes are much cheaper in ScyllaDB than in DynamoDB: non-LWT writes in ScyllaDB have roughly the same performance as reads, whereas in DynamoDB writes are at least 5 times more expensive than reads.
To make writing the change log more efficient, ScyllaDB wants the change-log entries to be "local": if a piece of data lives on a specific CPU of a specific node, we want its change log to be on the same CPU, and if a piece of data has a specific triplet of replicas, we want its change log to be on the same replicas. Putting all of this together, the number of stream "shards" (in DynamoDB terminology) comes out to vnodes * cpus_per_node,
where:

- vnodes is the number of token ranges that the token ring is split into, which determines the replication. A configuration parameter, num_tokens, controls how many vnode tokens each ScyllaDB server picks, so if there are N servers, then vnodes = num_tokens * N.
- cpus_per_node is the number of CPUs in one node.
One way to reduce the total vnodes * cpus is to reduce the num_tokens configuration parameter. To improve load balancing it usually defaults to 256, but in some cases (e.g., when you have just one node, or have three nodes and RF=3) you can successfully reduce it even to 1. So, for example, if you have one node with 8 CPUs, vnodes * cpus will by default be 256 * 8, i.e., 2048 as you noticed, but if you set num_tokens = 1 you'll get just 8 shards. Not as good as 1, but not as bad as 2048.
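To make the arithmetic concrete, here is a tiny sketch of the formula above; the one-node, 8-CPU figures are just the example numbers from this answer, not a recommendation:

```python
def stream_shards(num_tokens: int, nodes: int, cpus_per_node: int) -> int:
    # Stream "shards" (DynamoDB terminology) = vnodes * cpus_per_node,
    # with vnodes = num_tokens * number of nodes.
    vnodes = num_tokens * nodes
    return vnodes * cpus_per_node

print(stream_shards(num_tokens=256, nodes=1, cpus_per_node=8))  # 2048 (the default)
print(stream_shards(num_tokens=1, nodes=1, cpus_per_node=8))    # 8
```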
Unfortunately, if you can't reduce the number of shards, you do need to poll a lot more shards. This usually has two consequences:
You will probably poll the shards less frequently than you do in DynamoDB. If, as you said, an event only happens "once every few minutes", it might not be a problem for you to only poll each shard once every minute (a minimal polling sketch follows these two points). Whether or not a one-minute delay in an event that only happens once every few minutes matters to your application depends on the application, of course.
As I noted above, yes: reads of the change log will indeed be more costly for your application than they are in DynamoDB. But on the other hand, writes (without LWT) will be less costly, so depending on your application this trade-off can be either good or bad.
For example, an application that does a lot of writes will enjoy the write speedup, while the polling overhead will be minimal (because every poll will return a batch of results, with hardly any read work wasted on empty polls).
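To illustrate the first consequence, here is a minimal polling sketch using boto3 against the DynamoDB Streams API that Alternator Streams exposes. The endpoint URL, table name, and one-minute interval are assumptions for the example, and shard pagination (DescribeStream returns at most 100 shards per call), resharding, and error handling are left out:

```python
import time
import boto3

# Assumed local Alternator endpoint and table name; adjust for your cluster.
streams = boto3.client(
    "dynamodbstreams",
    endpoint_url="http://localhost:8000",
    region_name="us-east-1",            # required by boto3, ignored by Alternator
    aws_access_key_id="dummy",          # Alternator may be configured to verify these
    aws_secret_access_key="dummy",
)

stream_arn = streams.list_streams(TableName="mytable")["Streams"][0]["StreamArn"]
shards = streams.describe_stream(StreamArn=stream_arn)["StreamDescription"]["Shards"]

# One iterator per stream "shard": with num_tokens=1 on a single 8-CPU node this
# is 8 iterators; with the defaults it would be thousands.
iterators = {
    s["ShardId"]: streams.get_shard_iterator(
        StreamArn=stream_arn, ShardId=s["ShardId"], ShardIteratorType="LATEST"
    )["ShardIterator"]
    for s in shards
}

while True:
    for shard_id, it in iterators.items():
        resp = streams.get_records(ShardIterator=it, Limit=100)
        for record in resp["Records"]:
            print(shard_id, record["eventName"], record["dynamodb"].get("Keys"))
        iterators[shard_id] = resp["NextShardIterator"]
    time.sleep(60)  # one full pass per minute is plenty for rare events
```

The only thing that changes with the polling-frequency trade-off discussed above is the sleep interval; the per-shard GetRecords loop stays the same regardless of how many shards there are.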