I have a system where I start two separate processes, one that produces messages to a kafka topic, and one process that runs a number of KafkaConsumer instances, and consumes the messages. However, it appears that only one of the consumers is actually consuming the messages, whereas the second one is stuck doing nothing.
I have created a topic in Kafka with four instances using Kafdrop
. As far as I understand this should mean that if I have two processes in Airflow running KafkaConsumer, they should split the available instances between themselves and consume the messages from their "assigned" instance(s). The two KafkaConsumer instances belong to the same group-id
.
In the Airflow logs I only see one of the KafkaConsumers showing that it is processing messages. Below I have attached the python code to generate and to consume kafka messages, and also the Airflow scripts that I use to start the process of generating and consuming messages. At the end I also have one log from the process that does not consume any messages.
import airflow
from airflow.models import DAG
from airflow.operators.python_operator import PythonOperator
from preprocessing.testers import create_msg, print_current_time
import random
args = {
'owner': 'airflow',
'start_date': airflow.utils.dates.days_ago(1),
'provide_context': True,
}
dag = DAG(
dag_id='creating_msgs',
default_args=args,
schedule_interval= '@once',
catchup=True,
)
create_msgs = [
PythonOperator(
task_id='create_msg_'+str(i+1),
python_callable=create_msg,
op_kwargs={'task_id': i+1, 'nbr': random.randrange(50,100)},
dag=dag) for i in range(2)
]
print_time_task = PythonOperator(
task_id='print_time',
python_callable=print_current_time,
dag=dag
)
create_msgs >> print_time_task
Script to producing kafka message
def create_msg(**kwargs):
producer = KafkaProducer(bootstrap_servers=['kafka:9092'], # set up Producer
value_serializer=lambda x: json.dumps(x).encode('utf-8'))
for i in range(kwargs['nbr']):
task_name = 'Task' + str(kwargs['task_id']) + '_' + str(i+1)
logging.warn('Producing message of task: ' + task_name)
producer.send('TopicA', {'name': task_name})
sleep_time = random.randrange(1,5)
logging.warn('Sleeping for ' + str(sleep_time))
sleep(sleep_time)
producer.close()
Script to consuming kafka messages
import airflow
from airflow.models import DAG
from airflow.operators.python_operator import PythonOperator
from preprocessing.testers import print_current_time
from crawling.crawler import start_task
import random
args = {
'owner': 'airflow',
'start_date': airflow.utils.dates.days_ago(1),
'provide_context': True,
}
dag = DAG(
dag_id='crawl_listings',
default_args=args,
schedule_interval= '@once',
catchup=True,
)
start_tasks = [
PythonOperator(
task_id='start_task'+str(i+1),
python_callable=create_task,
op_kwargs={'task_id': i+1},
dag=dag) for i in range(2)
]
print_time_task = PythonOperator(
task_id='print_time',
python_callable=print_current_time,
dag=dag
)
start_tasks >> print_time_task
Task to run KafkaConsumer (there are two instances created of this).
def start_task(**kwargs):
consumer = KafkaConsumer(
'TopicA',
bootstrap_servers=['kafka:9092'],
consumer_timeout_ms=3000,
auto_offset_reset='earliest',
enable_auto_commit=True,
group_id='my-group',
value_deserializer=lambda x: json.loads(x.decode('utf-8')))
try:
for message in consumer:
logging.info('Processing: ' + message.value['name'])
sleep_time = random.randrange(1,5)
logging.warn('Sleeping for ' + str(sleep_time))
sleep(sleep_time)
logging.info('Done')
consumer.close()
except Exception as e:
print(e)
logging.error('Error: ' + e)
Here is the Airflow log from the task containing the KafkaConsumer which does not process any messages. The other process
[2022-09-15 13:23:11,737] {{taskinstance.py:887}} INFO - Executing <Task(PythonOperator): start_crawling_task2> on 2022-09-14T00:00:00+00:00
[2022-09-15 13:23:11,739] {{standard_task_runner.py:53}} INFO - Started process 3319 to run task
[2022-09-15 13:23:11,785] {{logging_mixin.py:112}} INFO - Running %s on host %s <TaskInstance: crawl_listings.start_crawling_task2 2022-09-14T00:00:00+00:00 [running]> f68c0ccde6c0
[2022-09-15 13:23:11,805] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:11,804] {{conn.py:372}} INFO - <BrokerConnection node_id=bootstrap-0 host=kafka:9092 <connecting> [IPv4 ('172.19.0.6', 9092)]>: connecting to kafka:9092 [('172.19.0.6', 9092) IPv4]
[2022-09-15 13:23:11,805] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:11,805] {{conn.py:1049}} INFO - Probing node bootstrap-0 broker version
[2022-09-15 13:23:11,807] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:11,806] {{conn.py:401}} INFO - <BrokerConnection node_id=bootstrap-0 host=kafka:9092 <connecting> [IPv4 ('172.19.0.6', 9092)]>: Connection complete.
[2022-09-15 13:23:11,913] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:11,913] {{conn.py:1106}} INFO - Broker version identifed as 1.0.0
[2022-09-15 13:23:11,914] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:11,913] {{conn.py:1108}} INFO - Set configuration api_version=(1, 0, 0) to skip auto check_version requests on startup
[2022-09-15 13:23:11,914] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:11,914] {{subscription_state.py:171}} INFO - Updating subscribed topics to: ('TopicA',)
[2022-09-15 13:23:12,121] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:12,121] {{conn.py:372}} INFO - <BrokerConnection node_id=1 host=kafka:9092 <connecting> [IPv4 ('172.19.0.6', 9092)]>: connecting to kafka:9092 [('172.19.0.6', 9092) IPv4]
[2022-09-15 13:23:12,222] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:12,222] {{conn.py:401}} INFO - <BrokerConnection node_id=1 host=kafka:9092 <connecting> [IPv4 ('172.19.0.6', 9092)]>: Connection complete.
[2022-09-15 13:23:12,222] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:12,222] {{conn.py:811}} INFO - <BrokerConnection node_id=bootstrap-0 host=kafka:9092 <connected> [IPv4 ('172.19.0.6', 9092)]>: Closing connection.
[2022-09-15 13:23:12,427] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:12,427] {{cluster.py:376}} INFO - Group coordinator for my-group is BrokerMetadata(nodeId=1, host='kafka', port=9092, rack=None)
[2022-09-15 13:23:12,428] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:12,427] {{base.py:688}} INFO - Discovered coordinator 1 for group my-group
[2022-09-15 13:23:12,428] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:12,428] {{base.py:735}} INFO - Starting new heartbeat thread
[2022-09-15 13:23:12,428] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:12,428] {{consumer.py:342}} INFO - Revoking previously assigned partitions set() for group my-group
[2022-09-15 13:23:12,429] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:12,429] {{base.py:447}} INFO - (Re-)joining group my-group
[2022-09-15 13:23:16,647] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:16,647] {{base.py:333}} INFO - Successfully joined group my-group with generation 2
[2022-09-15 13:23:16,647] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:16,647] {{subscription_state.py:257}} INFO - Updated partition assignment: [TopicPartition(topic='TopicA', partition=2), TopicPartition(topic='TopicA', partition=3)]
[2022-09-15 13:23:16,648] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:16,648] {{consumer.py:239}} INFO - Setting newly assigned partitions {TopicPartition(topic='TopicA', partition=2), TopicPartition(topic='TopicA', partition=3)} for group my-group
[2022-09-15 13:23:16,651] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:16,650] {{crawler.py:25}} INFO - Done
[2022-09-15 13:23:16,654] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:16,654] {{base.py:742}} INFO - Stopping heartbeat thread
[2022-09-15 13:23:16,654] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:16,654] {{base.py:767}} INFO - Leaving consumer group (my-group).
[2022-09-15 13:23:16,661] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:16,661] {{conn.py:811}} INFO - <BrokerConnection node_id=1 host=kafka:9092 <connected> [IPv4 ('172.19.0.6', 9092)]>: Closing connection.
[2022-09-15 13:23:16,662] {{python_operator.py:114}} INFO - Done. Returned value was: None
[2022-09-15 13:23:16,670] {{taskinstance.py:1048}} INFO - Marking task as SUCCESS.dag_id=crawl_listings, task_id=start_crawling_task2, execution_date=20220914T000000, start_date=20220915T132311, end_date=20220915T132316
[2022-09-15 13:23:21,700] {{logging_mixin.py:112}} INFO - [2022-09-15 13:23:21,700] {{local_task_job.py:103}} INFO - Task exited with return code 0