
I'm designing a web log analytics system.

I found an architecture with Django (back-end & front-end) + Kafka + Spark.

I also found a similar system described at this link: http://thevivekpandey.github.io/posts/2017-09-19-high-velocity-data-ingestion.html with the architecture below.

[architecture diagram from the linked post]

But I'm confused about the role of the Kafka consumer. It would be a service independent of Django, right?

So if I want to plot real-time data in a front-end chart, how do I attach it to Django?

It would seem ridiculous to place both the Kafka consumer and producer in Django: a request from the SDK comes to Django, is passed to a Kafka topic (producer), and returns to Django (consumer) for processing. Why don't we go directly? That looks simpler and better.

Please help me understand the role of the Kafka consumer: where should it belong, and how do I connect it to my front-end?

Thanks & best regards,

Jame

Jame H

1 Answer


The article describes the situation without Kafka:

We saw that in times of peak load, data ingestion was not working properly: it was taking too long to connect to MongoDB and requests were timing out. This was leading to data loss.

So the main point of introducing Kafka and a Kafka consumer is to avoid putting too much load on the DB layer, handling it gracefully with a messaging layer in between. To be honest, any message queue could be used in this case, not only Kafka.
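As a sketch of the ingest side of that messaging layer (the topic name `weblogs` and the event schema are my assumptions, not from the post), the Django view only publishes the event to Kafka and returns immediately, so a slow database never blocks the request:

```python
import json

# Hypothetical topic name -- adjust to your setup.
TOPIC = "weblogs"

def encode_event(event: dict) -> bytes:
    """Serialize one log event for the Kafka topic (assumed JSON schema)."""
    return json.dumps(event, sort_keys=True).encode("utf-8")

def ingest(request):
    """Django view: enqueue the event and return at once; the DB write
    happens later in the consumer, so peak load never times out here."""
    # Deferred imports so encode_event stays testable without
    # Django / confluent-kafka installed.
    from confluent_kafka import Producer
    from django.http import JsonResponse

    # In practice keep one module-level Producer instead of one per request.
    producer = Producer({"bootstrap.servers": "localhost:9092"})
    producer.produce(TOPIC, encode_event({"path": request.path}))
    producer.flush(1.0)  # bound the wait; don't block on the broker
    return JsonResponse({"queued": True})
```

The view does no database work at all; it hands the event to the broker and answers the SDK right away, which is exactly the decoupling the article is after.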

The Kafka consumer can be part of the web layer, but it wouldn't be optimal: you want separation of concerns (which makes the system more reliable in case of failures) and the ability to scale things independently.

It's better to implement the Kafka consumer as a separate service if the concerns mentioned above really matter (scalability and reliability) and it's operationally easy for you to do (since you now need to deploy, monitor, etc. a new service). In the end it's a classic monolith vs. microservices dilemma.
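A minimal sketch of such a standalone consumer service, using confluent-kafka-python as suggested in the comments below (topic, group id, and the event fields are my assumptions, not from the post):

```python
import json

# Hypothetical topic and consumer-group names -- adjust to your setup.
TOPIC = "weblogs"
GROUP_ID = "weblog-db-writer"

def process_message(raw: bytes) -> dict:
    """Turn one Kafka message into the record to persist (assumed schema)."""
    event = json.loads(raw.decode("utf-8"))
    return {
        "path": event.get("path"),
        "status": int(event.get("status", 0)),
        "ts": event.get("ts"),
    }

def save_to_db(record: dict) -> None:
    """Placeholder for the MongoDB write from the article."""
    print("saving", record)

def run() -> None:
    # Imported here so process_message stays testable without Kafka installed.
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": GROUP_ID,
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe([TOPIC])
    try:
        while True:
            msg = consumer.poll(1.0)  # wait up to 1s for the next message
            if msg is None or msg.error():
                continue
            save_to_db(process_message(msg.value()))
    finally:
        consumer.close()

if __name__ == "__main__":
    run()
```

Run it as its own process (or container, next to your Dockerized Kafka); scaling is then just running more instances with the same `group.id`, and Kafka will spread the topic's partitions across them.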

sap1ens
  • Thanks a lot. In the case of a separate Kafka consumer, if I use Python, how could I create a service for the Kafka consumer (which just listens, gets data, and puts it in a database)? And could I use PySpark for streaming in that case? Please help me, I'm a newbie in microservices; I used Docker to run Kafka, but I don't know how to create a service as a Kafka consumer. – Jame H Nov 29 '17 at 01:59
  • I suggest starting with a basic consumer using https://github.com/confluentinc/confluent-kafka-python. You can also take a look at Kafka Connect (https://docs.confluent.io/current/connect/intro.html). – sap1ens Nov 29 '17 at 19:40
  • PyKafka https://github.com/parsely/pykafka will be easier to integrate with Django since it has a more pythonic API and clearer documentation. – Emmett Butler Feb 27 '18 at 23:17
  • I know it's 2019, but since nobody has mentioned it, you might consider using PushPin, which acts as a Kafka consumer that you couple to Server-Sent Events (SSE) so you can stream updates to your web client. https://www.otago.ac.nz/its/forms/otago604826.html – Cameron Kerr Jun 08 '19 at 10:37