
Hi, I am learning the basics of Kafka. I installed Kafka using Docker and currently have one broker. I created a topic with 3 partitions using the command below.

kafka-topics --create --zookeeper zookeeper:2181 --replication-factor 1 --partitions 3 --topic topic2

After that I started a console producer as below.

 kafka-console-producer --broker-list localhost:9092 --topic topic2
 >This is my producer

I am totally confused here. When I add the data above, does it sit in all three partitions or in just one, given that I created three partitions? Within a partition, offsets start at zero. So in the example above, when I enter This is my producer, will the whole text sit at offset 0, or will each character get its own offset? I know this is very basic, but none of the documentation talks about it!

Next, the consumer part: if I want to consume some data and it is sitting in different partitions, how does the data come from the different partitions, and how is it consolidated? Can someone help me understand the basics? Any help would be greatly appreciated. Thanks

Giorgos Myrianthous
Niranjan

2 Answers


Partitioning

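Each message is written to exactly one partition. Messages without a key are spread across the partitions in a round-robin fashion (producer clients since Kafka 2.4 instead fill a batch for one partition before moving on to the next). Messages with the same key, however, are always written to the same partition.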
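
To make the rule above concrete, here is a minimal sketch of a partitioner in Python. It is a toy model and not Kafka's actual implementation: the function name `choose_partition` and the counter are hypothetical, and `crc32` stands in for the murmur2 hash that Kafka's default partitioner uses on the key bytes.

```python
import zlib
from itertools import count

NUM_PARTITIONS = 3  # matches the topic created in the question

# Hypothetical counter used to rotate keyless messages across partitions.
_round_robin = count()

def choose_partition(key, num_partitions=NUM_PARTITIONS):
    """Toy stand-in for a producer partitioner (not Kafka's real algorithm)."""
    if key is None:
        # No key: rotate through the partitions one after another.
        return next(_round_robin) % num_partitions
    # Key present: a stable hash keeps equal keys on the same partition.
    # Assumption: crc32 is used only for illustration; Kafka uses murmur2.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

keyless = [choose_partition(None) for _ in range(6)]
keyed = [choose_partition("user-42") for _ in range(3)]

print(keyless)  # [0, 1, 2, 0, 1, 2]
print(keyed)    # the same partition number three times
```

The keyless messages cycle through all three partitions, while every message with the key `user-42` lands on one and the same partition.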

Consumers

If you have N partitions, then you can have up to N active consumers within the same consumer group, each reading from a single partition. When you have fewer consumers than partitions, some of the consumers will read from more than one partition. If you have more consumers than partitions, the extra consumers will be idle and will receive no messages at all.
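
The three cases above can be sketched with a toy assignment function. This is only an illustration under stated assumptions: `assign_partitions` and the consumer names are hypothetical, and Kafka's real assignors (range, round-robin, sticky) may distribute partitions differently.

```python
def assign_partitions(partitions, consumers):
    """Toy round-robin assignment of partitions to the consumers of one group."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        # Hand out partitions one at a time, wrapping around the consumer list.
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment

partitions = [0, 1, 2]

# Fewer consumers than partitions: one consumer reads two partitions.
fewer = assign_partitions(partitions, ["c1", "c2"])
print(fewer)  # {'c1': [0, 2], 'c2': [1]}

# More consumers than partitions: the extra consumer gets nothing.
more = assign_partitions(partitions, ["c1", "c2", "c3", "c4"])
print(more)  # {'c1': [0], 'c2': [1], 'c3': [2], 'c4': []}
```

In both cases each partition is read by exactly one consumer of the group, which is why a fourth consumer on a three-partition topic stays idle.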

Giorgos Myrianthous
  • Thanks for your answer. One more question: if I add a line of text, will the entire line be stored at offset 0, or will each character be saved at its own offset? – Niranjan Nov 15 '19 at 09:18
  • The message will be stored as a whole into a partition and it won't be split into characters. – Giorgos Myrianthous Nov 15 '19 at 09:20
  • So we have offsets starting with 0, 1, 2, 3 and so on. My producer is sending the data hello world. So hello world will be at offset 0, right? – Niranjan Nov 15 '19 at 09:22
  • Assuming that this is the first message, then yes. The first message, e.g. `hello world`, will be assigned offset `0` in one of the partitions. Note that offsets are maintained at the partition level. This means that if a second message arrives at the topic and is added to another partition, it will be assigned offset `0` as well (but this time in a different partition). – Giorgos Myrianthous Nov 15 '19 at 09:26
  • Ok thanks. How much data can we store in one offset? Are there any limitations on size? – Niranjan Nov 15 '19 at 09:56
  • @Niranjan You cannot store anything in the offset. An offset is just a numerical sequential ID, which is used to uniquely identify each record within a partition. – Giorgos Myrianthous Nov 15 '19 at 11:18
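
The point made in the comments, that an offset is just a per-partition sequence number for a whole message, can be sketched as follows. This is a toy in-memory model, not how a broker stores data; the `Topic` class and its `append` method are hypothetical names used only for illustration.

```python
class Topic:
    """Toy in-memory topic: one append-only list per partition."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, message):
        """Store the whole message and return its offset in that partition."""
        log = self.partitions[partition]
        log.append(message)
        # The offset is simply the position of the record in this partition.
        return len(log) - 1

topic = Topic(3)
first = topic.append(0, "hello world")      # first record in partition 0
second = topic.append(1, "second message")  # first record in partition 1
third = topic.append(0, "third message")    # second record in partition 0

print(first, second, third)  # 0 0 1
```

The whole string `hello world` occupies a single offset, and offset `0` appears once per partition because each partition counts independently.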

Each message is sent to exactly one partition.

But do keep message ordering in mind when creating multiple partitions: the order of messages is guaranteed only within a partition.

So if you write a consumer and start reading from the beginning, the messages will not necessarily arrive in total order across partitions.
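
A small simulation may help show what "ordered within a partition, but not in total" means. It is a sketch under assumptions: messages are spread round-robin over three in-memory lists, and the consumer drains one partition after another, which is only one of many interleavings a real consumer could see.

```python
messages = ["m0", "m1", "m2", "m3", "m4", "m5"]  # order in which they were produced

# Spread the messages over 3 partitions in a round-robin fashion.
partitions = [[], [], []]
for i, message in enumerate(messages):
    partitions[i % 3].append(message)

# One possible read order: drain partition 0, then 1, then 2.
consumed = [message for log in partitions for message in log]

print(partitions)  # [['m0', 'm3'], ['m1', 'm4'], ['m2', 'm5']]
print(consumed)    # ['m0', 'm3', 'm1', 'm4', 'm2', 'm5']
```

Inside each partition the messages keep their produced order, yet the combined stream differs from the order in which they were sent. If strict ordering matters for a set of messages, giving them the same key keeps them in one partition.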

Subham Saraf