
I have a use case with an existing Kinesis stream, and I do not know how many shards it has. I have to create an application that consumes from the stream's shards, and I also have to Dockerize the application.

I was looking at this Docker image for doing this: https://github.com/alexdebrie/kclpy

Since I do not know how many shards there are, I spin up only one container, which contains my consumer code. My question is: if my Kinesis stream actually has 5 shards and I spin up only 1 container, how would KCL handle the distribution?

Would it create a process or a thread for each shard? Either way, how will Docker handle the creation of multiple processes/threads?

Can someone give me some hints? I am very new to this and could not learn much from the documentation.

Thanks in advance

Subhayan Bhattacharya

1 Answer


KCL handles reading from multiple shards by itself. It uses a DynamoDB table to keep track of the shards and the checkpointed sequence number for each one.

You can run as many instances of the KCL process as you like, up to the shard count. If you run only one instance, that single instance reads from all of the shards.
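To make the distribution easier to picture, here is a minimal sketch in plain Python. It is not KCL's real lease-balancing algorithm, and `assign_shards`, the container names, and the shard IDs are all made up for illustration. It only shows the outcome described above: every shard ends up with exactly one owner, one instance can own several shards, and instances beyond the shard count would have nothing to read.

```python
from collections import defaultdict


def assign_shards(shard_ids, worker_ids):
    """Hypothetical helper: give each shard to exactly one worker, round-robin.

    Illustration only. The real KCL coordinates ownership through leases
    stored in its DynamoDB table, not through a function like this.
    """
    if not worker_ids:
        raise ValueError("at least one worker is required")
    leases = defaultdict(list)
    for index, shard_id in enumerate(sorted(shard_ids)):
        owner = worker_ids[index % len(worker_ids)]
        leases[owner].append(shard_id)
    # List every worker, including ones that ended up with no shard.
    return {worker: leases.get(worker, []) for worker in worker_ids}


# Five made-up shard IDs, matching the 5-shard example in the question.
shards = [f"shardId-{n:012d}" for n in range(5)]

# One container: it owns all five shards.
print(assign_shards(shards, ["container-1"]))

# Two containers: the shards are split between them.
print(assign_shards(shards, ["container-1", "container-2"]))

# Six containers: one of them has no shard to read.
print(assign_shards(shards, [f"container-{n}" for n in range(1, 7)]))
```

So starting with a single container is a safe default when the shard count is unknown; more containers can be added later if one cannot keep up.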

user2679721