
My application consumes from one stream and then publishes the messages to three streams.

Binder:

import org.springframework.cloud.stream.annotation.Input;
import org.springframework.cloud.stream.annotation.Output;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.SubscribableChannel;

public interface MyBinder {

  @Input("input1")
  SubscribableChannel input1();

  @Output("output1")
  MessageChannel output1();

  @Output("output2")
  MessageChannel output2();

  @Output("output3")
  MessageChannel output3();


}

Config:

spring:
  cloud:
    stream:
      kinesis:
        binder:
          locks:
            leaseDuration: 30
            refreshPeriod: 3000
        bindings:
          input1:
            consumer:
              listenerMode: batch
              recordsLimit: 1500
              idleBetweenPolls: 10000
              consumer-backoff: 1000
      bindings:
        input1:
          group: my-group
          destination: input1-stream
          content-type: application/json
        output1:
          destination: output1-stream
          content-type: application/json
        output2:
          destination: output2-stream
          content-type: application/json
        output3:
          destination: output3-stream
          content-type: application/json

Each record we push to the streams is around 800 KB. We see a lot of data accumulating in AbstractAwsMessageHandler/AmazonKinesisAsyncClient, which is leading to very frequent GC cycles.

We are using version 1.0.0.RELEASE of the binder.

Can you please help?

Patan

1 Answer


All I can say from your configuration is that you are going to have up to 1500 * 3 PutRecordRequest instances in the AbstractAwsMessageHandler, and since it works in async mode by default, you may end up with a backlog of requests queued in memory, waiting for the AWS service to handle them. At ~800 KB per record, 1500 records fanned out to 3 streams is roughly 3.5 GB of payload in flight, which would explain the GC pressure.

You may consider decreasing that recordsLimit, or configuring all the producers to work in sync mode: https://github.com/spring-cloud/spring-cloud-stream-binder-aws-kinesis/blob/master/spring-cloud-stream-binder-kinesis-docs/src/main/asciidoc/overview.adoc#kinesis-producer-properties
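A sketch of what that could look like, assuming the `sync` and `sendTimeout` producer properties described in the doc linked above (verify the exact property names against your binder version, and tune the values for your throughput):

```yaml
spring:
  cloud:
    stream:
      kinesis:
        bindings:
          input1:
            consumer:
              listenerMode: batch
              recordsLimit: 100      # fewer records per poll -> fewer in-memory PutRecordRequests
          output1:
            producer:
              sync: true             # block the consumer thread until Kinesis confirms the put
              sendTimeout: 10000     # ms to wait for the record to be sent in sync mode
          output2:
            producer:
              sync: true
              sendTimeout: 10000
          output3:
            producer:
              sync: true
              sendTimeout: 10000
```

With sync producers the consumer thread only polls the next batch after the previous puts have completed, so the in-memory queue cannot grow unbounded.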

With fewer records to consume, you are going to have fewer objects in memory. With sync producing, the consumer thread is blocked, so it will not poll more records from the input stream until the puts complete.

Artem Bilan
  • Thank you for helping out. I will try out the options and come back – Patan Apr 17 '19 at 17:04
  • Is there any way that we can restrict the number of PutRecordRequest instances through some config? – Patan Apr 18 '19 at 04:01
  • It is restricted via throughput at the client level anyway, but since it is in async mode, all the records are just queued in memory. The best way is to restrict the number of records polled upstream, or to perform the put in sync mode to block the consumer on the same thread – Artem Bilan Apr 18 '19 at 12:26
  • Can you take a look at this and advise us: https://stackoverflow.com/questions/55877979/idlebetween-pools-not-pulling-messages-as-specified – Patan May 01 '19 at 04:24