
We have been trying to set up a production-level Kafka cluster on AWS Linux machines, and so far we have been unsuccessful.

Kafka version: 2.1.0

Machines:

5 r5.xlarge machines for 5 Kafka brokers.
3 t2.medium machines for ZooKeeper nodes
1 t2.medium node for Schema Registry and related tools (a single instance of each)
1 m5.xlarge machine for Debezium.

Default broker configuration:

num.partitions=15
min.insync.replicas=1
group.max.session.timeout.ms=2000000 
log.cleanup.policy=compact
default.replication.factor=3
zookeeper.session.timeout.ms=30000

Our problem is mainly related to huge data volumes. We are trying to transfer our existing tables into Kafka topics using Debezium. Many of these tables are quite huge, with over 50,000,000 rows.

So far we have tried many things, but our cluster fails every time for one or more of the following reasons.

Error 1:

ERROR Uncaught exception in scheduled task 'isr-expiration' (kafka.utils.KafkaScheduler)
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /brokers/topics/__consumer_offsets/partitions/0/state
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:130)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)..

Error 2:

] INFO [Partition xxx.public.driver_operation-14 broker=3] Cached zkVersion [21] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2018-12-12 14:07:26,551] INFO [Partition xxx.public.hub-14 broker=3] Shrinking ISR from 1,3 to 3 (kafka.cluster.Partition)
[2018-12-12 14:07:26,556] INFO [Partition xxx.public.hub-14 broker=3] Cached zkVersion [3] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2018-12-12 14:07:26,556] INFO [Partition xxx.public.field_data_12_2018-7 broker=3] Shrinking ISR from 1,3 to 3 (kafka.cluster.Partition)

Error 3:

isolationLevel=READ_UNCOMMITTED, toForget=, metadata=(sessionId=888665879, epoch=INITIAL)) (kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to 3 was disconnected before the response was read
    at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97)

Some more errors:

  1. Frequent disconnections among brokers, which is probably the reason behind the nonstop shrinking and expanding of ISRs with no auto-recovery.
  2. Schema Registry requests time out. I don't know how the Schema Registry is even affected; I don't see much load on that server. Am I missing something? Should I use a load balancer with multiple Schema Registry instances for failover? The topic __schemas has just 28 messages in it. The exact error message is RestClientException: Register operation timed out. Error code: 50002
  3. Sometimes the message transfer rate is over 100,000 messages per second; sometimes it drops to 2,000 messages per second. Could message size cause this?

In order to solve some of the above problems, we increased the number of brokers and increased zookeeper.session.timeout.ms to 30000, but I am not sure whether it actually solved our problem, and if it did, how.

I have a few questions:

  1. Is our cluster good enough to handle this much data?
  2. Is there anything obvious that we are missing?
  3. How can I load test my setup before moving to the production level?
  4. What could cause the session timeouts between the brokers and the Schema Registry?
  5. What is the best way to handle the Schema Registry problem?

[Screenshot: network load on one of our brokers]

[Screenshot: Network Bytes In on one of our brokers]

Feel free to ask for any more information.

  • 2.11 sounds like a Scala version, not a Kafka version. The Kafka version would be something like 2.0.0, 2.0.1, or the latest, which is 2.1.0. – Jakub Dec 13 '18 at 09:53
  • I meant https://kafka.apache.org/downloads#2.1.0 – Ankur rana Dec 13 '18 at 11:13
  • How much heap did you give to Kafka? Did you change any other settings? – OneCricketeer Dec 13 '18 at 15:20
  • I have set export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G". – Ankur rana Dec 13 '18 at 17:16
  • That is not enough... If you [read over this section](http://kafka.apache.org/documentation/#config), you'll notice that `6G` (and even more memory on the host itself) is mentioned for LinkedIn. And I am in an environment with `10G` and 252G total RAM on the host. Also, see docs by Confluent https://docs.confluent.io/current/kafka/deployment.html. Plus if doing a distributed deployment like this, possibly [use Ansible](https://github.com/confluentinc/cp-ansible) or [CloudFormation](https://github.com/aws-quickstart/quickstart-confluent-kafka) – OneCricketeer Dec 13 '18 at 23:48
  • What would you propose for our system? Our brokers have 32 GB RAM total. – Ankur rana Dec 14 '18 at 05:30
  • I mean each broker has 32 GB of RAM. – Ankur rana Dec 14 '18 at 06:14
  • I have increased it to Xmx10G. – Ankur rana Dec 14 '18 at 09:30
  • It's recommended to be between 4 and 8 GB – OneCricketeer Jan 10 '19 at 20:30
  • I tuned it again to Xmx3G, as that worked best for us. The real problem seems to be with I/O wait; our application was hogged by I/O wait. I am currently looking in that direction. – Ankur rana Jan 11 '19 at 02:40

1 Answer


Please use the latest official Confluent Platform version for your cluster.

You can also make things better by increasing the number of partitions of your topics and by setting tasks.max to more than 1 (in your sink connectors, of course) so the connector works more concurrently and faster.
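
As a rough illustration (all names and connection details here are hypothetical, not taken from the question), a sink connector configuration with more than one task might look like this:

# Hypothetical JDBC sink connector config; only tasks.max is the point here,
# the other values are placeholders.
name=example-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=5
topics=xxx.public.driver_operation
connection.url=jdbc:postgresql://target-db:5432/targetdb
connection.user=example_user
connection.password=example_password

Note that the effective parallelism of a sink connector is still bounded by the number of partitions of the topics it consumes.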

Please increase the replication of the Kafka Connect internal topics and use Kafka Connect distributed mode to improve the high availability of your Kafka Connect cluster. You can do this by setting the replication factors in the Kafka Connect (and Schema Registry) configs, for example:

config.storage.replication.factor=2
status.storage.replication.factor=2
offset.storage.replication.factor=2
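
For context, these settings belong in the distributed worker configuration (typically connect-distributed.properties); a minimal sketch, with hypothetical broker addresses and commonly used internal topic names:

bootstrap.servers=broker1:9092,broker2:9092,broker3:9092
group.id=connect-cluster
config.storage.topic=connect-configs
config.storage.replication.factor=2
offset.storage.topic=connect-offsets
offset.storage.replication.factor=2
status.storage.topic=connect-status
status.storage.replication.factor=2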

Please set the topic compression to snappy for your large tables; it will increase the throughput of the topics and help the Debezium connector work faster. Also, do not use the JSON converter; it is recommended to use the Avro converter instead.
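
A sketch of what that could look like (host names are placeholders): compression can be set as a broker-wide or per-topic compression.type, and the Avro converter is configured on the Connect worker together with the Schema Registry URL:

# broker-wide default (server.properties), or set the same property per topic
compression.type=snappy

# Connect worker converter settings
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://schema-registry:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://schema-registry:8081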

Also, please use a load balancer in front of your Schema Registry instances.

To test the cluster, you can create a connector with only one table (I mean a large table!) in the database.whitelist and set snapshot.mode to initial.
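
A minimal sketch of such a test connector (the property names below assume the Debezium PostgreSQL connector, since the topic names in the question look like Postgres tables; for MySQL the whitelist property is database.whitelist rather than table.whitelist, and all connection details here are placeholders):

name=test-single-table
connector.class=io.debezium.connector.postgresql.PostgresConnector
tasks.max=1
database.hostname=source-db
database.port=5432
database.user=debezium_user
database.password=debezium_password
database.dbname=sourcedb
database.server.name=xxx
table.whitelist=public.field_data_12_2018
snapshot.mode=initial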

As for the Schema Registry: it uses both Kafka and ZooKeeper, configured through these settings:

bootstrap.servers
kafkastore.connection.url

This is the reason for the downtime of your Schema Registry cluster.
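
A minimal schema-registry.properties sketch showing both sides (host names are placeholders; depending on the Schema Registry version, the Kafka brokers are given via kafkastore.bootstrap.servers, while kafkastore.connection.url points at the ZooKeeper ensemble):

listeners=http://0.0.0.0:8081
# Kafka brokers backing the schemas topic
kafkastore.bootstrap.servers=PLAINTEXT://broker1:9092,PLAINTEXT://broker2:9092
# ZooKeeper ensemble (older, ZooKeeper-based configuration)
kafkastore.connection.url=zk1:2181,zk2:2181,zk3:2181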
