
We are building a service on top of Kinesis / DynamoDB Streams and have a question about how checkpoints behave.

Our worker starts with the configuration `withInitialPositionInStream(InitialPositionInStream.LATEST)`, and the name of the KCL application is always the same.

When we turn the worker off and on again, it does not start consuming from the end of the stream. We have a lag metric, and right after the worker comes back up the consumption lag is several hours. We expected it to be less than 1 second, since the messages are being produced at that moment.

  • Is this expected behavior?
  • Are we misinterpreting how LATEST works?

Thank you very much.

Joaquín Fernández
  • Could you please clarify which version of the Kinesis library you use? Also, are you checkpointing manually? Thank you. – Mikalai Lushchytski Oct 05 '20 at 13:18
  • Hi Mikalai, I'm using version 2.2.11 of KCL in Java. We perform checkpoints manually in `processRecords`, but I think KCL performs checkpoints automatically after `processRecords`. – Joaquín Fernández Oct 05 '20 at 17:33
  • In general, LATEST is taken into account only on initial startup, when there is no lease table. If you restart the application with the same app id, KCL will resume from the latest checkpointed sequence number for each shard. As far as I know it does not checkpoint automatically, in contrast with, for example, a Kafka consumer. You have to checkpoint manually. – Mikalai Lushchytski Oct 05 '20 at 17:44

1 Answer


As the documentation for `InitialPositionInStream` states:

Used to specify the position in the stream where a new application should start from. This is used during initial application bootstrap (when a checkpoint doesn't exist for a shard or its parents).

So it is used only during the initial bootstrap of a new application. With LATEST, the worker starts after the most recent data record, but only when no checkpoint exists for the shard or its parents.

So if you turn your worker off and then on again, it is not expected to start from LATEST anymore. Instead, it resumes from the last checkpointed sequence number for each shard.
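To make the rule concrete, here is a minimal sketch of the decision in plain Java. It is only a model, not KCL code: `StartPositionSketch`, `LeaseTable` and `resolveStart` are hypothetical names invented for illustration, the map stands in for the lease table, and shard parents are ignored for simplicity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative model only: none of these types are KCL classes.
class StartPositionSketch {
    enum InitialPosition { LATEST, TRIM_HORIZON }

    /** Stand-in for the lease table: shard id -> last checkpointed sequence number. */
    static final class LeaseTable {
        private final Map<String, String> checkpoints = new HashMap<>();

        void checkpoint(String shardId, String sequenceNumber) {
            checkpoints.put(shardId, sequenceNumber);
        }

        Optional<String> lastCheckpoint(String shardId) {
            return Optional.ofNullable(checkpoints.get(shardId));
        }
    }

    /** An existing checkpoint always wins; the initial position is only a fallback. */
    static String resolveStart(LeaseTable table, String shardId, InitialPosition initial) {
        return table.lastCheckpoint(shardId)
                .map(seq -> "AFTER_SEQUENCE_NUMBER:" + seq)
                .orElse(initial.name());
    }

    public static void main(String[] args) {
        LeaseTable table = new LeaseTable();
        // First start of a new application: no checkpoint yet, so LATEST applies.
        System.out.println(resolveStart(table, "shard-0", InitialPosition.LATEST));
        // After a checkpoint, a restart resumes from it, however old it is.
        table.checkpoint("shard-0", "4957");
        System.out.println(resolveStart(table, "shard-0", InitialPosition.LATEST));
    }
}
```

In this model the second call returns the stored sequence number even though LATEST is still configured, which matches the lag you see after a restart.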

KCL does not checkpoint automatically, so if your worker starts with a lag of hours, you are probably checkpointing too rarely.
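One way to keep the checkpoint close to the tip of the stream is to checkpoint on a fixed cadence from `processRecords`. The helper below is a rough sketch under stated assumptions: `CheckpointThrottle` and its parameters are hypothetical, and the `Runnable` is a placeholder for whatever your processor uses to checkpoint (in KCL 2.x I believe that is `processRecordsInput.checkpointer().checkpoint()`, but check this against your version).

```java
import java.util.function.LongSupplier;

// Hypothetical helper, not part of KCL: decides when a checkpoint is due.
class CheckpointThrottle {
    private final int maxRecords;
    private final long maxIntervalMillis;
    private final LongSupplier clock; // injected so the timing can be tested
    private int recordsSinceCheckpoint = 0;
    private long lastCheckpointMillis;

    CheckpointThrottle(int maxRecords, long maxIntervalMillis, LongSupplier clock) {
        this.maxRecords = maxRecords;
        this.maxIntervalMillis = maxIntervalMillis;
        this.clock = clock;
        this.lastCheckpointMillis = clock.getAsLong();
    }

    /**
     * Call once a batch has been fully processed. Runs checkpointAction when
     * enough records or enough time has accumulated; returns true if it ran.
     */
    boolean afterBatch(int batchSize, Runnable checkpointAction) {
        recordsSinceCheckpoint += batchSize;
        long now = clock.getAsLong();
        boolean due = recordsSinceCheckpoint >= maxRecords
                || now - lastCheckpointMillis >= maxIntervalMillis;
        if (!due) {
            return false;
        }
        checkpointAction.run();
        // Reset only after the action returned without throwing.
        recordsSinceCheckpoint = 0;
        lastCheckpointMillis = now;
        return true;
    }
}
```

With something like `new CheckpointThrottle(1000, 60_000L, System::currentTimeMillis)`, a restart should replay at most roughly the last minute or the last 1000 records per shard, rather than hours. The thresholds are arbitrary examples; tune them against how much reprocessing you can tolerate.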

Mikalai Lushchytski