When reading about Kafka and how to get data from Kafka to a queryable database suited for some specific task, there is usually mention of Kafka Connect sinks.
This sounds like the way to go if I needed Kafka to search indexing like ElasticSearch or analytics like Hadoop to Spark where there's a Kafka Connect sink available.
But my question is what is the best way to handle a store that isn't as popular say MyImaginaryDB, where the only way I can get to it is through some API, and the data needs to be handled securely and reliably, as well as decently transformed before inserting? Is it recommended to:
- Just have the API consume from Kafka and use the MyImaginaryDB driver to write
- Figure out how to build a custom Kafka Connect sink (assuming it can handle schemas, authentication/authorization, retries, fault-tolerance, transforms and post-processing needed before landing in MyImaginaryDB)
I have also been reading about Kafka KSQL and Streams and am wondering if that helps with transforming the data before it is sent to the end store.