
What are the settings that affect the number of TCP connections made to Kafka? Background: MSK IAM auth has a throttle limit on new connections.

Some things I can think of:

  1. tasks.max
  2. number of partitions
  3. number of brokers
  4. replication factor
tooptoop4
  • What solution did you end up with? We are fighting the issue of TCP connections as well, specifically new TCP connections per second when IAM auth is enabled. For others reading this: be very careful and thoughtful when enabling AWS IAM auth for MSK. It GREATLY limits the number of new connections you can open per second to a broker. It seems to be causing a nightmare for Kafka Connect when a worker goes down because of auto-scaling down or EC2 instances going out of service. https://docs.aws.amazon.com/msk/latest/developerguide/limits.html – Dude0001 Nov 15 '22 at 21:35
  • 1 task, 3 brokers, replication factor 3, often 1 partition; if our data is too large we go straight from source to S3 without MSK – tooptoop4 Nov 15 '22 at 22:35

2 Answers


There's no specific number.

For a rough estimate: of the options listed above, tasks.max is the only configurable one from the Connect API that matters. Each task starts its own set of consumer/producer instances, which only communicate with the leader of each partition they read or write.
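For illustration, a minimal sink connector config sketch; the name, class, and topic here are hypothetical, but tasks.max is the actual knob:

# hypothetical S3 sink; each task runs its own consumer, with its own broker connections
name=example-s3-sink
connector.class=io.confluent.connect.s3.S3SinkConnector
topics=example-topic
tasks.max=2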

Internally to the framework, there is also data being produced and consumed on the Connect status, offsets, and config topics. By default, a few of those have dozens of partitions (the offsets topic alone defaults to 25), which can mean a connection for each.
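These internal topics and their partition counts come from the distributed worker config; a sketch, using common topic-name conventions (the partition counts shown are the Kafka Connect defaults):

group.id=connect-cluster
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status
# defaults shown; setting these lower on a new cluster reduces how many partitions the framework touches
offset.storage.partitions=25
status.storage.partitions=5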

After data reaches the leader partition, it is then replicated within the cluster according to your replication factor (still over TCP).

Some source connectors may additionally create an AdminClient connection in order to create topics ahead of writing the data.
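For example, with KIP-158 automatic topic creation (on by default via topic.creation.enable=true in the worker config), a source connector carrying settings like these causes its task to open an extra AdminClient connection to apply them; the values here are illustrative:

topic.creation.default.replication.factor=3
topic.creation.default.partitions=1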

Other connectors may use additional topics, such as an errors.tolerance dead-letter queue, or more specific ones like confluent.license.topic, Debezium's database history topic, or MirrorMaker 2's heartbeat topic.
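For the dead-letter-queue case specifically, a sink connector sketch (standard Connect property names; the topic name is illustrative). Writing to the DLQ means the sink task also runs a producer, with its own connections:

errors.tolerance=all
errors.deadletterqueue.topic.name=example-dlq
errors.deadletterqueue.topic.replication.factor=3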

If you're using Confluent Schema Registry, then that service also maintains a _schemas topic, with its own client connections to the cluster.
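That topic is configured on the Schema Registry server itself; a sketch (the topic value shown is the default, the bootstrap address is a placeholder):

kafkastore.bootstrap.servers=SASL_SSL://broker1:9098
kafkastore.topic=_schemas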

Then, finally, sink consumers will be committing their offsets to the __consumer_offsets topic.


For some of these, increasing internal client configs, such as consumer max.poll.records or producer batch.size, will reduce the frequency of connections made, at the expense of potentially dropping/duplicating data during errors or rebalances.
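These can be tuned per connector through client overrides; a sketch, assuming the worker permits overrides via connector.client.config.override.policy=All (the property names are real, the values illustrative):

consumer.override.max.poll.records=5000
producer.override.batch.size=65536
producer.override.linger.ms=100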

OneCricketeer

In my case, we were seeing an error like the one below. It looks like we were getting a SASL token back from MSK, but getting throttled by the EC2 instance metadata service while retrieving the AWS credentials to evaluate it. It turns out this is not retriable by the reconnect.backoff.max.ms and reconnect.backoff.ms logic of the Kafka client (https://kafka.apache.org/documentation/#producerconfigs_reconnect.backoff.ms), which is what the MSK documentation points you to for retrying when MSK throttles new TCP connections with IAM authentication enabled on your cluster (https://docs.aws.amazon.com/msk/latest/developerguide/limits.html#msk-provisioned-quota).

We are using the Java library aws-msk-iam-auth. I found it has retry and exponential-backoff-with-jitter logic to account for these transient connectivity errors while fetching the AWS credentials on the client, which requires some additions to your JAAS config string:

sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required awsMaxRetries="7" awsMaxBackOffTimeMs="500";

https://github.com/aws/aws-msk-iam-auth#retries-while-getting-credentials
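For context, here is roughly how that line sits among the rest of the IAM auth client properties described in the aws-msk-iam-auth README (the retry values are our choices, not defaults):

security.protocol=SASL_SSL
sasl.mechanism=AWS_MSK_IAM
sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required awsMaxRetries="7" awsMaxBackOffTimeMs="500";
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler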

org.apache.kafka.common.errors.SaslAuthenticationException: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: Failed to find AWS IAM Credentials [Caused by aws_msk_iam_auth_shadow.com.amazonaws.SdkClientException: Unable to load AWS credentials from any provider in the chain: [aws_msk_iam_auth_shadow.com.amazonaws.auth.AWSCredentialsProviderChain@1d00a730: Unable to load AWS credentials from any provider in the chain: [EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY)), SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey), WebIdentityTokenCredentialsProvider: You must specify a value for roleArn and roleSessionName, software.amazon.msk.auth.iam.internals.EnhancedProfileCredentialsProvider@6dff3234: Profile file contained no credentials for profile 'default': ProfileFile(profiles=[]), aws_msk_iam_auth_shadow.com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper@7aaa946f: Too Many Requests (Service: null; Status Code: 429; Error Code: null; Request ID: null; Proxy: null)]]]) occurred when evaluating SASL token received from the Kafka Broker. Kafka Client will go to AUTHENTICATION_FAILED state.

I'm not clear if this is exactly what prompted the original question, but it brought me here, along with many other dead ends. Hopefully this helps someone else, as the MSK documentation only mentions the Kafka Connect settings, which were ineffective in this scenario, and it took me a lot of time and frustration to discover the settings in the aws-msk-iam-auth library.

Dude0001