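I am using the pykafka library to publish messages to Kafka. My data set is a JSON array:
[
{"user": "jpoole", "created_at_unixtime": 1448221456.6646008, "id": 3731785240073317438, "text": "Glad I bought my electronics from @fudgemart", "created_at": "Sun Nov 22 14:44:16 +0000 2015"},
{"user": "jpoole", "created_at_unixtime": 1440407147.033846, "id": 3600730356622213650, "text": "Techical support for my new computer as A+, thank you @fudgemart", "created_at": "Mon Aug 24 05:05:47 +0000 2015"}
]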
I need to generate two Kafka messages, one for each JSON object above, using PyKafka. This is what I have tried so far:
import json
from pykafka import KafkaClient

client = KafkaClient(hosts="127.0.0.1:9092")
topic = client.topics['test']

with open('./tweets.json') as f:
    dataItems = json.load(f)

s = json.dumps(dataItems).encode('utf-8')

with topic.get_sync_producer() as producer:
    for data in s:
        producer.produce(data)
The JSON is stored in a file, tweets.json (that is part of my original requirement). The code above runs, but instead of sending each JSON object as a whole, it sends every character of the serialized string as a separate message.
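The per-character behavior can be reproduced without Kafka. Below is a small sketch, assuming a made-up two-item list in place of the parsed tweets file (the names data_items, serialized and first_chars are only for illustration). json.dumps returns one string for the whole list, so looping over that string visits single characters:

```python
import json

# Hypothetical stand-in for the parsed contents of tweets.json.
data_items = [{"user": "a"}, {"user": "b"}]

# json.dumps serializes the whole list into ONE string.
serialized = json.dumps(data_items)

# Iterating over a string yields one character at a time,
# which matches the "every character is a message" symptom.
first_chars = list(serialized)[:3]
print(first_chars)
```

On Python 3, iterating over the encoded bytes object would yield integers rather than one-character strings, but the loop is still walking the serialized text piece by piece and not the list of tweets.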
What I need is for each JSON object to be published as a separate Kafka message:
Message 1
{"user": "jpoole", "created_at_unixtime": 1448221456.6646008, "id": 3731785240073317438, "text": "Glad I bought my electronics from @fudgemart", "created_at": "Sun Nov 22 14:44:16 +0000 2015"}
Message 2
{"user": "jpoole", "created_at_unixtime": 1440407147.033846, "id": 3600730356622213650, "text": "Techical support for my new computer as A+, thank you @fudgemart", "created_at": "Mon Aug 24 05:05:47 +0000 2015"}
Thanks