
I have a file that contains multiple JSON documents in the following format.

{"attribute1": "value1", "attribute2": "value2", "attribute3": "value3", "attribute4": "value4"} {"attribute1": "value11", "attribute2": "value12", "attribute3": "value13", "attribute4": "value14"} {"attribute1": "value21", "attribute22": "value2", "attribute23": "value3", "attribute4": "value24"}

I am trying to send the individual JSON documents to Kafka. The script exits with code 0, but I can see no messages coming through on the Kafka consumer. I am not sure where I am going wrong.

My code is as follows:

import csv
import json

bootstrap = ['hostname:9092']
valueSerializer = lambda x: dumps(x).encode('utf-8')

producer = KafkaProducer(bootstrap_servers = bootstrap, value_serializer = valueSerializer)

table = []
with open('~/json_file_name.json', 'r') as json_file:
    for line in json_file:
        table.append(json.loads(line))

#numrows = len(table)
#print(numrows)

for row in table:
    print(row)
    producer.send('Topic_Name', value=row)

jsn

1 Answer


It's likely you're not sending enough data for the producer to flush its batch. You haven't shown your import for KafkaProducer, but try calling producer.flush() at the end of the script.
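To see why flushing matters: send() in kafka-python is asynchronous and only enqueues the record; delivery happens when a batch fills or when you flush. A toy stand-in can illustrate that behaviour (ToyProducer is purely illustrative, not the real library class):

```python
class ToyProducer:
    """Illustration only: records sit in a buffer until the batch
    fills up or flush() is called, mimicking an async producer."""

    def __init__(self, batch_size=100):
        self.batch_size = batch_size
        self.buffer = []
        self.delivered = []

    def send(self, topic, value):
        self.buffer.append((topic, value))
        # A real producer delivers a batch once it is full
        # (or after a linger timeout).
        if len(self.buffer) >= self.batch_size:
            self._drain()

    def _drain(self):
        self.delivered.extend(self.buffer)
        self.buffer.clear()

    def flush(self):
        # Force delivery of everything still buffered.
        self._drain()


producer = ToyProducer(batch_size=100)
for i in range(3):
    producer.send('Topic_Name', f'message {i}')

# Only 3 records were sent: the batch never filled, so nothing
# has been delivered yet. This mirrors a script exiting with
# code 0 while its messages are still sitting in the buffer.
undelivered = producer.delivered == []
producer.flush()
delivered_count = len(producer.delivered)
```

If the script exits before a batch fills, the buffered records are simply lost, which matches the symptom in the question.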


You don't need the table variable, by the way; just send each line as you read the file. You also don't need dumps(x) in the serializer: each line is already a JSON string, so encoding it is enough.

You can also remove the unused csv import.

OneCricketeer
    I added producer.flush() at the end of the script and the data was sent to Kafka – jsn Jan 03 '20 at 18:25