Questions tagged [kafka-python]

kafka-python provides low-level protocol support for Apache Kafka as well as high-level consumer and producer classes. The protocol supports request batching and broker-aware request routing, and gzip and Snappy compression are supported for message sets.

For more details about the Python Kafka client API, please refer to https://kafka-python.readthedocs.io/en/latest/

443 questions
8 votes · 1 answer

How to properly use pyspark to send data to a kafka broker?

I'm trying to write a simple pyspark job which would receive data from a kafka broker topic, do some transformation on that data, and put the transformed data on a different kafka broker topic. I have the following code, which reads data from a…
Eugene Goldberg · 14,286
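A minimal sketch of one way to answer this: keep the per-record transformation as a plain Python function (testable without a cluster) and wrap it in a UDF for Spark's Kafka source/sink. The topic names, broker address, and the "uppercase the name field" transformation are all assumptions for illustration; the Structured Streaming wiring is shown in comments because it needs a running broker and pyspark.

```python
import json

def transform(record: dict) -> dict:
    # hypothetical transformation: uppercase the "name" field
    out = dict(record)
    if "name" in out:
        out["name"] = out["name"].upper()
    return out

def transform_payload(raw: bytes) -> bytes:
    # Kafka source/sink "value" columns are bytes: decode, transform, re-encode
    return json.dumps(transform(json.loads(raw.decode("utf-8")))).encode("utf-8")

# Structured Streaming wiring (requires pyspark and a reachable broker):
# from pyspark.sql import SparkSession
# from pyspark.sql.functions import udf, col
# from pyspark.sql.types import BinaryType
#
# spark = SparkSession.builder.appName("kafka-transform").getOrCreate()
# src = (spark.readStream.format("kafka")
#        .option("kafka.bootstrap.servers", "localhost:9092")
#        .option("subscribe", "input-topic").load())
# transform_udf = udf(transform_payload, BinaryType())
# (src.select(transform_udf(col("value")).alias("value"))
#     .writeStream.format("kafka")
#     .option("kafka.bootstrap.servers", "localhost:9092")
#     .option("topic", "output-topic")
#     .option("checkpointLocation", "/tmp/ckpt")
#     .start().awaitTermination())
```

Keeping the transformation pure makes it unit-testable independently of both Spark and Kafka.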
8 votes · 1 answer

Does Kafka guarantee message ordering within a single partition with ANY config param values?

If I set Kafka config params at the producer as: 1. retries = 3 2. max.in.flight.requests.per.connection = 5 then it's likely that messages within one partition may not be in send order. Does Kafka take any extra step to make sure that messages within a…
user3851499
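The short answer is no: the broker preserves arrival order within a partition, but with `retries > 0` and more than one in-flight request, a failed-and-retried batch can land after a later batch that succeeded on the first try. Limiting in-flight requests to one restores ordering at the cost of throughput. A sketch using kafka-python's parameter names (the broker address is an assumption):

```python
# Producer configuration under which retries cannot reorder messages
# within a partition: only one in-flight request per connection.
ordering_safe_config = {
    "bootstrap_servers": "localhost:9092",  # assumption
    "acks": "all",
    "retries": 3,
    # with >1 in-flight request, a failed+retried batch can land after
    # a later batch that succeeded on the first try
    "max_in_flight_requests_per_connection": 1,
}

# from kafka import KafkaProducer
# producer = KafkaProducer(**ordering_safe_config)
```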
8 votes · 2 answers

How to send a JSON object to Kafka from a Python client

I have a simple JSON object like the following d = { 'tag ': 'blah', 'name' : 'sam', 'score': {'row1': 100, 'row2': 200 } } The following is my python code which is sending messages to Kafka from kafka import SimpleProducer,…
Rahul · 11,129
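The excerpt uses the long-deprecated `SimpleProducer`; with the current `KafkaProducer` API the usual approach is a `value_serializer` that turns the dict into UTF-8 JSON bytes. A minimal sketch (broker address and topic name are assumptions):

```python
import json

def json_serializer(obj) -> bytes:
    # Kafka carries bytes on the wire; encode the dict as UTF-8 JSON
    return json.dumps(obj).encode("utf-8")

# wiring (requires a running broker):
# from kafka import KafkaProducer
# producer = KafkaProducer(bootstrap_servers="localhost:9092",
#                          value_serializer=json_serializer)
# producer.send("my-topic", {"tag": "blah", "name": "sam",
#                            "score": {"row1": 100, "row2": 200}})
```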
7 votes · 0 answers

kafka-python: Closing the kafka producer with 0 vs inf secs timeout

I am trying to produce messages to a Kafka topic using kafka-python 2.0.1 on Python 2.7 (can't use Python 3 due to some workplace-related limitations). I created a class as below in a separate module, compiled the package, and installed it in a virtual…
Anurag Rana · 1,429
7 votes · 1 answer

How to add a failure callback for kafka-python kafka.KafkaProducer#send()?

I would like to set a callback to be fired if a produced record fails. Initially, I would just like to log the failed record. The Confluent Kafka Python library provides a mechanism for adding a callback: produce(topic[, value][, key][,…
Chris Snow · 23,813
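In kafka-python, `KafkaProducer.send()` returns a future, and the future accepts `add_callback` / `add_errback` handlers. A sketch that just logs the failure, as the question asks (the topic name is an assumption):

```python
import logging

log = logging.getLogger(__name__)
failed_records = []

def on_send_success(metadata):
    # metadata is a RecordMetadata with topic/partition/offset fields
    log.info("delivered to %s[%d]@%d", metadata.topic, metadata.partition, metadata.offset)

def on_send_error(exc):
    # initially just record and log the failure
    failed_records.append(exc)
    log.error("failed to deliver record", exc_info=exc)

# wiring (requires a running broker):
# from kafka import KafkaProducer
# producer = KafkaProducer(bootstrap_servers="localhost:9092")
# future = producer.send("my-topic", b"payload")
# future.add_callback(on_send_success)
# future.add_errback(on_send_error)
```

Because `send()` is asynchronous, the errback is the only reliable place to observe per-record failures without blocking on `future.get()`.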
7 votes · 1 answer

Sending a large CSV to Kafka using Python Spark

I am trying to send a large CSV to kafka. The basic structure is to read a line of the CSV and zip it with the header: a = dict(zip(header, line.split(","))) This then gets converted to a json with: message = json.dumps(a) I then use kafka-python…
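The header-zip step from the question can be done more robustly with the stdlib `csv` module, which handles quoted fields that a bare `line.split(",")` would break on. A sketch (the topic name in the comment is an assumption):

```python
import csv
import io
import json

def rows_to_json(csv_text: str):
    # yield one JSON string per data row, keyed by the header row
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    for line in reader:
        yield json.dumps(dict(zip(header, line)))

# producer loop (requires a running broker):
# for msg in rows_to_json(open("big.csv").read()):
#     producer.send("csv-topic", msg.encode("utf-8"))
```

For a truly large CSV, streaming the file object into `csv.reader` directly (instead of reading it all into memory) keeps the footprint constant.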
6 votes · 4 answers

How to count the number of records (messages) in a topic using kafka-python

As said in the title, I want to get the number of records in my topic and I can't find a solution using the kafka-python library. Does anyone have any idea?
LilyAZ · 133
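One common approach is to sum `end_offsets - beginning_offsets` over all partitions, which kafka-python exposes on the consumer. Note this counts offsets, so it can over-count on compacted or transactional topics. The arithmetic is a pure function; the broker wiring (topic and address are assumptions) is shown in comments:

```python
def count_messages(beginning_offsets: dict, end_offsets: dict) -> int:
    # offsets are per-partition; the difference is the number of offsets
    # currently in each partition
    return sum(end_offsets[tp] - beginning_offsets[tp] for tp in end_offsets)

# with kafka-python (requires a running broker):
# from kafka import KafkaConsumer, TopicPartition
# consumer = KafkaConsumer(bootstrap_servers="localhost:9092")
# parts = [TopicPartition("my-topic", p)
#          for p in consumer.partitions_for_topic("my-topic")]
# total = count_messages(consumer.beginning_offsets(parts),
#                        consumer.end_offsets(parts))
```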
6 votes · 1 answer

How to find the schema id from the schema registry used for avro records, when reading from a kafka consumer

We use the schema registry for storing schemas, and messages are serialised to avro and pushed to kafka topics. I wanted to know, when reading data from a consumer, how to find the schema id with which the avro record was serialised. We require this schema…
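Messages produced with the Confluent serializers use a fixed wire format: one magic byte (0), then the schema id as a 4-byte big-endian integer, then the Avro payload. The id can therefore be read straight off the raw message bytes:

```python
import struct

def extract_schema_id(payload: bytes) -> int:
    # Confluent wire format: magic byte 0, then 4-byte big-endian schema id
    magic, schema_id = struct.unpack(">bI", payload[:5])
    if magic != 0:
        raise ValueError("not in Confluent Schema Registry wire format")
    return schema_id
```

With the id in hand, the schema itself can be fetched from the registry at `GET /schemas/ids/<id>`.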
6 votes · 2 answers

kafka-python: How to consume a json message

I am fairly new to Python and starting with Kafka. I have a requirement where I need to send and consume json messages. For this I am using kafka-python to communicate with Kafka. #Producer.py from kafka import KafkaProducer import json producer…
Paras · 3,191
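The consumer side mirrors the producer's serializer: pass a `value_deserializer` so each message value arrives as a dict instead of raw bytes. A sketch (topic and broker address are assumptions):

```python
import json

def json_deserializer(raw: bytes):
    # decode the UTF-8 JSON bytes produced by the matching serializer
    return json.loads(raw.decode("utf-8"))

# wiring (requires a running broker):
# from kafka import KafkaConsumer
# consumer = KafkaConsumer("my-topic", bootstrap_servers="localhost:9092",
#                          value_deserializer=json_deserializer)
# for msg in consumer:
#     print(msg.value)  # already a dict
```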
6 votes · 3 answers

Kafka not receiving messages when indicating group_id in Python

I am using Kafka (kafka-python) version 3.0.0-1.3.0.0.p0.40. I need to configure the consumer for the topic 'simulation' in Python. When I don't indicate the group_id, i.e. group_id = None, it receives messages fine. However, if I indicate the…
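One frequent cause of this symptom: when a `group_id` is set and the group has no committed offsets, `auto_offset_reset` defaults to `'latest'`, so the consumer only sees messages produced after it joins. Setting it to `'earliest'` replays from the start. A sketch (group name and broker address are assumptions):

```python
consumer_config = {
    "bootstrap_servers": "localhost:9092",  # assumption
    "group_id": "simulation-group",         # hypothetical group name
    # with a group_id and no committed offsets, 'latest' (the default)
    # skips everything produced before the consumer joined
    "auto_offset_reset": "earliest",
}

# from kafka import KafkaConsumer
# consumer = KafkaConsumer("simulation", **consumer_config)
```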
6 votes · 1 answer

How to get the latest offset value from a confluent_python AVRO consumer

I am pretty new to confluent_kafka but I've gained some experience with kafka-python. What I am trying to do is change the offset where to start consuming messages. This is why I'd like to build a consumer client able to move back to previous…
hellbreak · 361
6 votes · 3 answers

For AvroProducer to Kafka, where are avro schema for "key" and "value"?

From the AvroProducer example in the confluent-kafka-python repo, it appears that the key/value schemas are loaded from files. That is, from this code: from confluent_kafka import avro from confluent_kafka.avro import AvroProducer value_schema =…
Kode Charlie · 1,297
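The schemas don't have to come from files: `confluent_kafka.avro` also exposes `avro.loads()`, which parses a schema from a JSON string. A sketch with inline schemas (the record shape, topic, and URLs are assumptions; the wiring requires `confluent-kafka[avro]` and running services):

```python
import json

# key is a bare string; value is a hypothetical "User" record
key_schema_str = '{"type": "string"}'
value_schema_str = json.dumps({
    "type": "record",
    "name": "User",
    "fields": [{"name": "name", "type": "string"}],
})

# from confluent_kafka import avro
# from confluent_kafka.avro import AvroProducer
# producer = AvroProducer(
#     {"bootstrap.servers": "localhost:9092",
#      "schema.registry.url": "http://localhost:8081"},
#     default_key_schema=avro.loads(key_schema_str),
#     default_value_schema=avro.loads(value_schema_str),
# )
# producer.produce(topic="users", key="u1", value={"name": "sam"})
```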
6 votes · 4 answers

kafka-python read from last produced message after a consumer restart

I am using kafka-python to consume messages from a kafka queue (kafka version 0.10.2.0). In particular I am using the KafkaConsumer type. If the consumer stops and after a while is restarted, I would like to restart from the latest produced message,…
ugomaria · 185
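One way to always resume from the head of the partitions, regardless of committed offsets, is to force assignment with a zero-timeout `poll()` and then call `seek_to_end()`. A sketch using kafka-python method names (topic, group, and broker address are assumptions):

```python
def restart_from_latest(consumer):
    """Ignore any committed position and resume from the latest produced message."""
    consumer.poll(timeout_ms=0)  # force partition assignment first
    consumer.seek_to_end()       # skip everything produced while we were down

# wiring (requires a running broker):
# from kafka import KafkaConsumer
# consumer = KafkaConsumer("my-topic", bootstrap_servers="localhost:9092",
#                          group_id="my-group", enable_auto_commit=False)
# restart_from_latest(consumer)
# for msg in consumer:
#     ...
```

Disabling auto-commit in this setup avoids committing offsets that would be ignored on the next restart anyway.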
6 votes · 2 answers

Kafka 10 - Python Client with Authentication and Authorization

I have a Kafka 10 cluster with SASL_SSL (authentication (JAAS) and authorization) enabled. I am able to connect through SASL using the Java client with the below…
user1578872 · 7,808
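kafka-python supports SASL_SSL directly through constructor parameters, so the JAAS file's mechanism and credentials map onto plain keyword arguments. A sketch for the common SASL/PLAIN case (address, credentials, and CA path are all assumptions):

```python
sasl_config = {
    "bootstrap_servers": "broker:9093",   # assumption: the SASL_SSL listener
    "security_protocol": "SASL_SSL",
    "sasl_mechanism": "PLAIN",            # must match the broker's JAAS mechanism
    "sasl_plain_username": "alice",       # assumption
    "sasl_plain_password": "secret",      # assumption
    "ssl_cafile": "/path/to/ca.pem",      # CA that signed the broker certificate
}

# from kafka import KafkaConsumer
# consumer = KafkaConsumer("my-topic", **sasl_config)
```

If the broker uses SCRAM instead, `sasl_mechanism` becomes e.g. `"SCRAM-SHA-256"` with the same username/password parameters.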
6 votes · 1 answer

Python: Mocking out Kafka for integration tests

I'm somewhat new to integration tests. I have two services that pass messages to one another using Kafka. However, for my integration tests, I don't necessarily want to have Kafka running in order to run my tests. Is there a standard way to mock out…
user1658296 · 1,398
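A common pattern is to keep the producer behind a thin function boundary and substitute a `unittest.mock.Mock` in tests, so no broker is needed. The function and topic names here are illustrative, not from the question:

```python
from unittest import mock

def publish_event(producer, topic: str, payload: bytes):
    # thin wrapper: all Kafka interaction goes through the injected producer
    producer.send(topic, payload)
    producer.flush()

# in a test, patch the producer instead of running a real broker:
producer = mock.Mock()
publish_event(producer, "events", b'{"ok": true}')
producer.send.assert_called_once_with("events", b'{"ok": true}')
producer.flush.assert_called_once()
```

When end-to-end coverage is actually needed, spinning up a throwaway broker (e.g. in a container) is the usual alternative to mocking, at the cost of slower tests.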