
I have a use case where I receive clickstream events, need to run some computations on them using data from Cassandra, and finally push the computed values to Redshift.

I'm exploring the tech stack for the streaming and compute parts of this use case. Is it possible to use the Kafka Streams library?

If anyone has used it for something similar, could you shed some light on the pros and cons, or suggest alternatives?

Bankelaal

1 Answer


With Kafka Streams you'll need to pull data from Cassandra "manually": query it from inside your own code using a plain session.execute call, or use the Object Mapper.
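To make the "manual" lookup concrete, here is a minimal sketch of the per-record pattern. Kafka Streams itself is a Java library, so inside a real topology this logic would be written in Java; the sketch uses Python with the DataStax Python driver only to show the shape of the lookup. The table `shop.user_profiles`, the `segment` column, the `user_id` field, and the `make_enricher` helper are all hypothetical names, not something from the question.

```python
def make_enricher(session, table="shop.user_profiles"):
    """Return a function that enriches one click event with a Cassandra row.

    `session` is expected to behave like a cassandra-driver Session,
    i.e. to provide prepare() and execute(). The table and column names
    are assumptions made for illustration.
    """
    # Prepare the statement once and reuse it for every record.
    stmt = session.prepare(f"SELECT segment FROM {table} WHERE user_id = ?")

    def enrich(click):
        # One point lookup by partition key per incoming event.
        row = session.execute(stmt, [click["user_id"]]).one()
        enriched = dict(click)
        # Keep the event even when no matching row exists.
        enriched["segment"] = row.segment if row is not None else None
        return enriched

    return enrich
```

With the real driver, the session would come from something like `Cluster(["127.0.0.1"]).connect()`. Note that this issues one query per event, so throughput depends on Cassandra latency; a local cache in front of the lookup is a common mitigation.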

As an alternative, you can look at Apache Spark, which can work with both streaming data from Kafka and data in Cassandra (via the Spark Cassandra Connector). Looking up data in Cassandra is quite a common task when you need to enrich streaming data with data from a database: you can join the stream with data in Cassandra and then implement your calculations on the joined result. If you want concrete examples, see my blog post on efficient joins with data in Cassandra.

If you do go with Spark, use Spark Structured Streaming, as it heavily simplifies the development of such applications.
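As a rough sketch of what such a Structured Streaming job could look like in PySpark: read clicks from Kafka, join them with a Cassandra table, and append each micro-batch to Redshift over JDBC. It assumes the Kafka source, the Spark Cassandra Connector, and a Redshift-compatible JDBC driver are on the classpath. The JSON layout in `CLICK_SCHEMA`, the `user_id` join key, and all function and parameter names are assumptions for illustration.

```python
# Assumed JSON layout of a click event, as a DDL-formatted schema string.
CLICK_SCHEMA = "user_id STRING, url STRING, ts TIMESTAMP"


def cassandra_options(keyspace, table):
    """Options understood by the Spark Cassandra Connector data source."""
    return {"keyspace": keyspace, "table": table}


def jdbc_options(url, table):
    """Options for Spark's generic JDBC writer (credentials omitted here)."""
    return {"url": url, "dbtable": table}


def build_enriched_stream(spark, bootstrap_servers, topic, keyspace, table):
    # Imported lazily so the helpers above can be used without pyspark.
    from pyspark.sql.functions import col, from_json

    clicks = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .load()
        # Kafka values arrive as bytes; parse them as JSON click events.
        .select(from_json(col("value").cast("string"), CLICK_SCHEMA).alias("c"))
        .select("c.*")
    )
    profiles = (
        spark.read.format("org.apache.spark.sql.cassandra")
        .options(**cassandra_options(keyspace, table))
        .load()
    )
    # Stream-static join: each click is enriched with its Cassandra row.
    return clicks.join(profiles, on="user_id", how="left")


def write_to_redshift(enriched, jdbc_url, target_table, checkpoint_dir):
    def write_batch(batch_df, batch_id):
        # Append one micro-batch to the target table over JDBC.
        (batch_df.write.format("jdbc")
            .options(**jdbc_options(jdbc_url, target_table))
            .mode("append")
            .save())

    return (
        enriched.writeStream
        .foreachBatch(write_batch)
        .option("checkpointLocation", checkpoint_dir)
        .start()
    )
```

The actual computation would go between the join and the write. Whether the join is executed as per-key lookups rather than a full table scan depends on the connector version and configuration, so that is worth checking against the connector documentation.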

Alex Ott