We have an aggregation system where the Aggregator is an KDA application running Flink which Aggregates the data over 6hrs time window and puts all the data into AWS Kinesis data Stream.
We also have an Consumer Application that uses KCL 2.x library and reads the data from KDS and puts the data into DynamoDB. We are using default KCL configuration and have set the poll time to 30 seconds. The issue that we are facing now is, Consumer application is reading all the data in KDS with in few minutes, causing huge writes in DynamoDB with in short period of time causing scaling issues in DynamoDB.
We would like to consume the KDS Data slowly and even out the data consumption across time allowing us to keep lower provisioned capacity for WCU's.
One way we can do that is increase the polling time for the KCL consumer application, I am trying to see if there is any configuration that can limit the number of records that we can poll, helping us to reduce the write throughput in dynamoDB or any other way to fix this problem?
Appreciate any responses