I don't want to write the processed KStream to another topic; I want to write the enriched KStream directly to a database. How should I proceed?
- As Matthias says in his answer, this is not a good design pattern to follow. You couple your streams application to your database this way. It is much better to write back into Kafka and then use Kafka Connect to stream the data to the database. – Robin Moffatt Oct 03 '17 at 03:58
1 Answer
You can implement a custom Processor that opens a DB connection and apply it via KStream#process(). Cf. https://docs.confluent.io/current/streams/developer-guide/dsl-api.html#applying-processors-and-transformers-processor-api-integration
Note that you will need to write to your DB synchronously to guard against data loss.
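To make the "sync writes" point concrete, here is a minimal sketch of the blocking write logic such a Processor could call from its process() method. It uses only the Java standard library: RecordSink, SyncDbWriter and FlakySink are hypothetical names invented for this illustration (not part of Kafka Streams), and the in-memory sink stands in for a real database client. The idea is that write() returns only after the sink has confirmed the row, so the record is never treated as processed before it is stored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a database client; a real app would wrap JDBC or similar.
interface RecordSink {
    // Must return only after the row is stored, or throw on failure.
    void insert(String key, String value) throws Exception;
}

// Blocking writer, intended to be called once per record from Processor#process().
final class SyncDbWriter {
    private final RecordSink sink;
    private final int maxAttempts;

    SyncDbWriter(RecordSink sink, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.sink = sink;
        this.maxAttempts = maxAttempts;
    }

    // Returns the number of attempts used; throws if every attempt fails.
    int write(String key, String value) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sink.insert(key, value); // blocks until the sink confirms
                return attempt;
            } catch (Exception e) {
                last = e; // remember the failure and retry
            }
        }
        // Failing loudly stops processing instead of silently dropping the record.
        throw new IllegalStateException(
                "DB write failed after " + maxAttempts + " attempts", last);
    }
}

// In-memory sink that fails a configurable number of times (demonstration only).
final class FlakySink implements RecordSink {
    final List<String> rows = new ArrayList<>();
    private int failuresLeft;

    FlakySink(int failures) {
        this.failuresLeft = failures;
    }

    @Override
    public void insert(String key, String value) throws Exception {
        if (failuresLeft > 0) {
            failuresLeft--;
            throw new Exception("simulated outage");
        }
        rows.add(key + "=" + value);
    }
}
```

For example, new SyncDbWriter(new FlakySink(2), 3).write("machine-1", "42") succeeds on the third attempt, while a sink that stays down makes write() throw. In a Streams app that exception would propagate out of process() and stop the stream thread, which is exactly the coupling to the database described in this answer.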
Thus, not writing back to a topic has multiple disadvantages:
- reduced throughput because of sync writes
- you cannot use exactly-once semantics (the database write happens outside of Kafka's transactions)
- coupling your application with the database (if DB goes down, your app goes down, too, as it can't write its results anymore)
Therefore, it's recommended to write the results back into a topic and use the Kafka Connect API to get the data into your database.
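As a rough illustration of the Connect side, a sink connector configuration for a relational database could look like the fragment below. This is only a sketch that assumes the Confluent JDBC sink connector is installed on the Connect workers; the connector name, topic name, connection URL, credentials and key column are hypothetical placeholders, not values from the question.

```properties
# Hypothetical connector name
name=enriched-events-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
# Hypothetical topic that the Streams app writes its enriched records to
topics=enriched-events
# Hypothetical database location and credentials
connection.url=jdbc:postgresql://db.example.com:5432/metrics
connection.user=connect_user
connection.password=change-me
# Upsert by record key so that redelivered records do not create duplicate rows
insert.mode=upsert
pk.mode=record_key
# Hypothetical column name for the record key
pk.fields=machine_id
auto.create=true
```

Note that this connector needs records with a schema (for example Avro, or JSON with embedded schemas). Sink connectors can also subscribe with topics.regex instead of a fixed topics list, which may help when there are many output topics.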

Boris

Matthias J. Sax
- Thanks Matthias, it really gave me a direction to think about whether I can change my design. However, the problem in my case is that I have too many topics (one per machine), and I would have to create the same number of topics to ingest the transformed streams. I don't know how it will behave with a very large number of topics. – Megha Oct 03 '17 at 14:15
- I understand what you are saying. In general, you should just scale out your Kafka cluster by adding more brokers to handle an increased load. – Matthias J. Sax Oct 03 '17 at 18:01
- @MatthiasJ.Sax I know this might be an old answer, but I am facing something similar. We need to output the events in a topic to SQS. Which connector would you recommend that is safe and reliable from a resilience point of view, for example if the connection to SQS fails or some other issue arises? Or would this be better handled with classic consumers and producers? – Lucian Oct 27 '22 at 11:24
- Not sure off the top of my head. You can try https://www.confluent.io/hub/ (*Disclaimer: I work for Confluent) to see if there is a good connector for SQS. In general, I would recommend using the Connect framework. If there is no available connector, it's still better to implement a custom connector than to use the consumer API, as the Connect framework does a lot of heavy lifting that you would otherwise need to re-implement with the plain consumer API. – Matthias J. Sax Nov 01 '22 at 16:55