
I'm trying to design an Akka Stream using Alpakka that reads events from a Kafka topic and puts them into Couchbase.

So far I have the following code and it seems to work somehow:

    import akka.kafka.Subscriptions
    import akka.kafka.scaladsl.Consumer
    import akka.stream.alpakka.couchbase.scaladsl.CouchbaseFlow
    import akka.stream.scaladsl.Sink
    import com.couchbase.client.java.document.RawJsonDocument

    Consumer
      .committableSource(consumerSettings, Subscriptions.topics(topicIn))
      .map(profile ⇒ {
        // the CommittableOffset carried by `profile` is dropped here
        RawJsonDocument.create(profile.record.key(), profile.record.value())
      })
      .via(
        CouchbaseFlow.upsertDoc(
          sessionSettings,
          writeSettings,
          bucketName
        )
      )
      .log("Couchbase stream logging")
      .runWith(Sink.seq)

By "somehow" I mean that the stream is actually reads events from topic and put them to Couchbase as json documents and it looks even nice despite a fact that I don't understand how to commit consumer offsets to Kafka.

If I've correctly understood the main idea behind Kafka consumer offsets, then whenever a failure or restart happens, the stream reads all messages from the last committed offset, and since we haven't committed any, it probably re-reads the records processed in the previous session once again.

So am I right in my assumptions? If so, how should consumer commits be handled when reading from Kafka and publishing to a database? The official Akka Streams documentation provides examples showing how to deal with such cases using plain Kafka streams, so I have no idea how to commit the offsets in my case.

Many thanks!

Alex Sergeenko
  • Try passing the `CommittableOffset` through all of your function calls and then use a `Committer.sink` to capture the offset and commit it back. – Dan W Oct 18 '19 at 13:59
  • So far I mainly got the idea that I have to provide a `CommittableOffset` to the related sink using `toMat` or `via(Committer.flow)`, but the problem is that I can't wrap my head around a possible implementation. – Alex Sergeenko Oct 18 '19 at 14:20
  • Why not use Kafka Connect plugin offered by Couchbase? – OneCricketeer Oct 26 '19 at 23:42
  • @cricket_007 I wasn't aware of it, will take a look, thanks! – Alex Sergeenko Oct 31 '19 at 07:04
  • https://docs.couchbase.com/server/4.5/connectors/kafka-3.1/quickstart.html#writing-data-with-sink-connector – OneCricketeer Oct 31 '19 at 10:04
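
For reference, a minimal sketch of the pattern Dan W suggests above: keep the `CommittableOffset` next to the payload, do the Couchbase write in a `mapAsync` stage, and emit only the offset into a `Committer.sink`, so a record is committed to Kafka only after its document has been written (at-least-once). It uses a direct `CouchbaseSession` instead of `CouchbaseFlow.upsertDoc`, because that flow does not carry a pass-through element; the session construction and its `upsertDoc` call are assumptions, while `consumerSettings`, `sessionSettings`, `bucketName` and `topicIn` are the values from the question.

    import akka.actor.ActorSystem
    import akka.kafka.{CommitterSettings, Subscriptions}
    import akka.kafka.scaladsl.{Committer, Consumer}
    import akka.stream.alpakka.couchbase.scaladsl.CouchbaseSession
    import com.couchbase.client.java.document.RawJsonDocument

    implicit val system: ActorSystem = ActorSystem("kafka-to-couchbase")
    import system.dispatcher // ExecutionContext for Future transformations

    val committerSettings = CommitterSettings(system)

    // Assumption: a CouchbaseSession gives direct access to the bucket, so the
    // offset can travel alongside the write.
    CouchbaseSession(sessionSettings, bucketName).foreach { session =>
      Consumer
        .committableSource(consumerSettings, Subscriptions.topics(topicIn))
        .mapAsync(parallelism = 4) { msg =>
          val doc = RawJsonDocument.create(msg.record.key(), msg.record.value())
          // complete the upsert first, then hand only the offset downstream
          session.upsertDoc(doc).map(_ => msg.committableOffset)
        }
        .runWith(Committer.sink(committerSettings)) // commits offsets in batches
    }

With this shape a crash between the upsert and the commit means the record is re-read and re-upserted on restart, which is harmless here since the upsert is idempotent per document key.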

1 Answer


You will need to commit the offsets in Couchbase in order to obtain "exactly once" semantics.

This should help: https://doc.akka.io/docs/alpakka-kafka/current/consumer.html#offset-storage-external-to-kafka
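
The approach on that page bypasses Kafka's commit mechanism entirely: on startup the last processed offset is read from the external store (here Couchbase), a plain, non-committable source is started from exactly that position, and each document is written together with its offset. A rough sketch under those assumptions; `loadOffsetFromCouchbase` and `upsertDocAndOffset` are hypothetical helpers, a single partition is assumed for brevity, and `consumerSettings` / `topicIn` come from the question:

    import akka.Done
    import akka.kafka.Subscriptions
    import akka.kafka.scaladsl.Consumer
    import akka.stream.scaladsl.Sink
    import org.apache.kafka.clients.consumer.ConsumerRecord
    import org.apache.kafka.common.TopicPartition

    import scala.concurrent.Future

    // Hypothetical helpers: the offset lives in Couchbase itself, e.g. in one
    // designated document per partition, updated together with the data.
    def loadOffsetFromCouchbase(partition: Int): Future[Long] = ???
    def upsertDocAndOffset(record: ConsumerRecord[String, String]): Future[Done] = ???

    val partition = 0 // single partition assumed for brevity

    // An implicit ActorSystem / ExecutionContext is assumed in scope, as above.
    loadOffsetFromCouchbase(partition).foreach { fromOffset =>
      Consumer
        .plainSource(
          consumerSettings,
          Subscriptions.assignmentWithOffset(new TopicPartition(topicIn, partition), fromOffset)
        )
        .mapAsync(parallelism = 1)(upsertDocAndOffset) // offset stored with the doc
        .runWith(Sink.ignore)
    }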

João Guitana
  • Hi! Thank you for the answer! I generally don't need "exactly once" semantics in this case; I just want to make sure that my app won't re-read uncommitted offsets after a restart. – Alex Sergeenko Oct 21 '19 at 06:34