I have a custom function forward_to_kafka(list: List)
which sends my events to Kafka even if there are problems; its purpose is to deliver every message in the list. I have already tested it with the JetBrains Big Data Tools plugin and it works correctly.
But now I need to write automated integration tests and I already have this one:
@pytest.mark.parametrize("topic, hits", [
    ("topic1", []),
    ("topic3", ["ev1", "ev2", "ev3"]),
    ("topic2", ["event"]),
    ("topic4", ['{"event": ["arr_elem"]}', '{"event_num": 13}', '{"event": {"subev": "value"}}']),
])
@pytest.mark.integration
def test_forward_to_kafka_integration(self, topic, hits, output):
    kafka_host = 'localhost:9094'
    producer = KafkaProducer(bootstrap_servers=[kafka_host], acks='all',)
    output.forward_to_kafka(producer, topic, [message.encode() for message in hits])
    consumer = KafkaConsumer(topic, bootstrap_servers=[kafka_host],
                             group_id=f"{topic}_grp", auto_offset_reset='earliest',
                             consumer_timeout_ms=1000)
    received_messages = [message.value.decode() for message in consumer]
    print(received_messages)
    assert all([message in received_messages for message in hits])
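For completeness, here is roughly the scaffolding around the test, as a simplified sketch (the real output fixture may construct Output differently; the traceback below only shows it is an h3ra.output.Output instance):

import pytest
from kafka import KafkaProducer, KafkaConsumer  # kafka-python

from h3ra.output import Output


@pytest.fixture
def output():
    # simplified: the actual fixture may pass configuration to Output
    return Output()


class TestForwardToKafka:
    ...  # the parametrized test above lives here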
I have Kafka in a Docker container and the connection is fine, but the second test always fails. To be more precise: if I run Kafka empty, with no saved data (just a clean image), the test that performs the first actual push always fails.
The result of executing the second test is below:
FAILED [ 50%][]
test_output.py:90 (TestForwardToKafka.test_forward_to_kafka_integration[topic3-hits1])
self = <tests.test_output.TestForwardToKafka object at 0x107b49520>
topic = 'topic3', hits = ['ev1', 'ev2', 'ev3']
output = <h3ra.output.Output object at 0x107c14b80>

    @pytest.mark.parametrize("topic, hits", [
        ("topic1", []),
        ("topic3", ["ev1", "ev2", "ev3"]),
        ("topic2", ["event"]),
        ("topic4", ['{"event": ["arr_elem"]}', '{"event_num": 13}', '{"event": {"subev": "value"}}']),
    ])
    @pytest.mark.integration
    def test_forward_to_kafka_integration(self, topic, hits, output):
        kafka_host = 'localhost:9094'
        producer = KafkaProducer(bootstrap_servers=[kafka_host], acks='all',)
        output.forward_to_kafka(producer, topic, [message.encode() for message in hits])
        consumer = KafkaConsumer(topic, bootstrap_servers=[kafka_host],
                                 group_id=f"{topic}_grp", auto_offset_reset='earliest',
                                 consumer_timeout_ms=1000)
        received_messages = [message.value.decode() for message in consumer]
        print(received_messages)
>       assert all([message in received_messages for message in hits])
E       assert False
E        +  where False = all([False, False, False])

test_output.py:107: AssertionError
As you can see, the received_messages array is empty.
But when I connect with the Kafka plugin from Big Data Tools, I can see that these messages were delivered.
What am I doing wrong?
Edit
To reproduce, you can replace
output.forward_to_kafka(producer, topic, [message.encode() for message in hits])
with
for message in hits:
    producer.send(topic, message)
producer.flush()
In essence, that is all my function does.
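Put together as a standalone script (host, topic, and data taken from the failing test case above), the reproduction looks like this:

from kafka import KafkaProducer, KafkaConsumer

kafka_host = 'localhost:9094'
topic = 'topic3'
hits = ["ev1", "ev2", "ev3"]

producer = KafkaProducer(bootstrap_servers=[kafka_host], acks='all')
for message in hits:
    # send() expects bytes here because no value_serializer is configured
    producer.send(topic, message.encode())
producer.flush()

consumer = KafkaConsumer(topic, bootstrap_servers=[kafka_host],
                         group_id=f"{topic}_grp", auto_offset_reset='earliest',
                         consumer_timeout_ms=1000)
print([message.value.decode() for message in consumer])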
Here is my docker-compose.yml file:
version: "3.9"
services:
kafka: # DNS-1035
hostname: kafka
image: docker-proxy.artifactory.tcsbank.ru/bitnami/kafka:3.5
expose:
- "9092"
- "9093"
- "9094"
volumes:
- "kafka_data:/bitnami"
environment:
- ALLOW_PLAINTEXT_LISTENER=yes
- KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093,EXTERNAL://:9094
- KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092,EXTERNAL://kafka:9094
- KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT,PLAINTEXT:PLAINTEXT
- KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE=true
- KAFKA_CFG_NODE_ID=0
- KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=0@kafka:9093
- KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
- KAFKA_CFG_PROCESS_ROLES=controller,broker
volumes:
kafka_data:
driver: local
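For what it's worth, since the setup relies on KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE, the broker's reachability and the auto-created topics can be sanity-checked with kafka-python's admin client (a quick sketch, using the same host as the tests):

from kafka import KafkaAdminClient

admin = KafkaAdminClient(bootstrap_servers=['localhost:9094'])
# lists all topic names known to the broker, including auto-created ones
print(admin.list_topics())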