
I want to send a large message from a producer to Kafka, so I've changed the properties below.

Broker (server.properties)

replica.fetch.max.bytes=317344026
message.max.bytes=317344026
max.message.bytes=317344026
max.request.size=317344026

Producer (producer.properties)

max.request.size=3173440261

Consumer (consumer.properties)

max.partition.fetch.bytes=327344026
fetch.message.max.bytes=317344026

Still, I'm getting the error below when I run the producer through Python's Popen and the Kafka CLI.

Code:

import subprocess

def producer(topic_name, content):
    p = subprocess.Popen(['/opt/kafka/kafka_2.11-0.9.0.0/bin/kafka-console-producer.sh', '--broker-list', 'localhost:9092', '--topic', 'Hello-Kafka'], stdout=subprocess.PIPE, stdin=subprocess.PIPE)
    out, err = p.communicate(content)  # write content to stdin and wait for exit
    print out

Error:

ERROR Error when sending message to topic Hello-Kafka with key: null, value: 1677562 bytes with error: The message is 1677588 bytes when serialized which is larger than the maximum request size you have configured with the max.request.size configuration. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)

And I'm getting the error below when I use the Python module for Kafka (https://github.com/dpkp/kafka-python):

Code:

from kafka import KafkaProducer

def producer(topic_name, content):
    p = KafkaProducer(bootstrap_servers='localhost:9092')
    a = p.send(topic_name, content).get()  # .get() blocks until the broker acks
    print a
    p.flush()
    p.close()

Error:

kafka.errors.MessageSizeTooLargeError: [Error 10] MessageSizeTooLargeError: The message is 217344026 bytes when serialized which is larger than the maximum request size you have configured with the max_request_size configuration

One thing I've tried successfully is splitting the content into chunks, but I'd like a solution that doesn't require splitting, if anyone has one.
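For context, a minimal sketch of the chunking approach that worked (the chunk size and helper name are illustrative, not my actual code):

```python
def split_into_chunks(content, chunk_size):
    """Split a bytes payload into chunk_size pieces; the last may be shorter."""
    return [content[i:i + chunk_size] for i in range(0, len(content), chunk_size)]

# A 2.5 MB payload split into 1 MB chunks -> 3 chunks, each under the limit;
# the consumer reassembles the original by concatenating chunks in order.
payload = b"x" * 2500000
chunks = split_into_chunks(payload, 1000000)
print(len(chunks))
```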

Mickael Maison
Vatsal Jagani
  • what kind of serialization are you using? – AbhishekN Aug 09 '18 at 16:43
  • Kafka brokers are not designed to handle 300MB messages. You will find yourself having poor performance unless you have tons of free memory and are an expert in Linux/Java memory management. The best strategy is to break it up. That said, in your example, you didn't pass the producer properties file to the console producer so you didn't set the config (at least as written). – dawsaw Aug 10 '18 at 03:44
  • @AbhishekN, I'm sending it in the string directly as I read it from the file. – Vatsal Jagani Aug 10 '18 at 04:22
  • @dawsaw, What do you mean that "you didn't pass the producer properties file to the console producer" - How can I do that? Can you please provide any reference? – Vatsal Jagani Aug 10 '18 at 04:25

2 Answers


kafka-console-producer.sh

You didn't pass your producer.properties file when calling kafka-console-producer.sh, so the producer ran with default settings.
Use the --producer.config flag.
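For example (paths are illustrative; adjust them to your installation):

```shell
# Pass the properties file explicitly so max.request.size is picked up:
/opt/kafka/kafka_2.11-0.9.0.0/bin/kafka-console-producer.sh \
  --broker-list localhost:9092 \
  --topic Hello-Kafka \
  --producer.config /opt/kafka/kafka_2.11-0.9.0.0/config/producer.properties
```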

KafkaProducer

Your KafkaProducer is using default values; you have to set max_request_size when creating it.
See the KafkaProducer docs.

KafkaProducer(bootstrap_servers='localhost:9092', max_request_size=3173440261)
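As a quick sanity check of the limit itself: the error in the question reports a 1677562-byte value becoming 1677588 bytes when serialized, i.e. about 26 bytes of record overhead. A pre-send check along those lines (the helper and the overhead constant are illustrative, not part of kafka-python):

```python
def fits_in_request(payload, max_request_size):
    """Return True if the payload, plus approximate per-record overhead,
    is within the producer's max_request_size (the limit behind
    MessageSizeTooLargeError)."""
    RECORD_OVERHEAD = 26  # matches the 1677588 - 1677562 gap in the error above
    return len(payload) + RECORD_OVERHEAD <= max_request_size

# The question's failing case against the default 1 MB limit:
print(fits_in_request(b"x" * 1677562, 1048576))
```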
Gery

Your string size is really huge; that's not really a message suited to a queue-based system, so do rethink your platform's architecture. That said, you can try compression configurations and see if they help.

Kafka data compression: there are two ways to compress data in Kafka, on the producer side and on the broker side. Both have pros and cons; I have found (and I think others recommend this too) that producer-side compression is better since it gives better batch optimization.

"compression.codec"="2"
"compressed.topics"="<your-topic-name>"

(0: No compression, 1: GZIP compression, 2: Snappy compression, 3: LZ4 compression)

Further read: Compression ideas
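As a Kafka-independent illustration of why compression helps for repetitive text payloads (gzip here purely for the demonstration; Snappy and LZ4 behave similarly in spirit):

```python
import gzip

# A highly repetitive text payload, like the log-style content in the question:
text = b"2018-08-09 INFO request handled in 12ms\n" * 50000
compressed = gzip.compress(text)

# Repetitive text shrinks dramatically, so the serialized batch the
# producer sends can land well under max.request.size.
print(len(text), len(compressed))
```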

AbhishekN