I want to setup Flink so it would transform and redirect the data streams from Apache Kafka to MongoDB. For testing purposes I'm building on top of flink-streaming-connectors.kafka example (https://github.com/apache/flink).
Kafka streams are being properly red by Flink, I can map them etc., but the problem occurs when I want to save each recieved and transformed message to MongoDB. The only example I've found about MongoDB integration is flink-mongodb-test from github. Unfortunately it uses static data source (database), not the Data Stream.
I believe there should be some DataStream.addSink implementation for MongoDB, but apparently there's not.
What would be the best way to achieve it? Do I need to write the custom sink function or maybe I'm missing something? Maybe it should be done in different way?
I'm not tied to any solution, so any suggestion would be appreciated.
Below there's an example what exactly i'm getting as an input and what I need to store as an output.
Apache Kafka Broker <-------------- "AAABBBCCCDDD" (String)
Apache Kafka Broker --------------> Flink: DataStream<String>
Flink: DataStream.map({
return ("AAABBBCCCDDD").convertTo("A: AAA; B: BBB; C: CCC; D: DDD")
})
.rebalance()
.addSink(MongoDBSinkFunction); // store the row in MongoDB collection
As you can see in this example I'm using Flink mostly for Kafka's message stream buffering and some basic parsing.