tl;dr: I'm using Spring, reactive Kafka, WebFlux and coroutines, and need to somehow connect the receiver `Flux` to a coroutine `Flow`. Are `GlobalScope` and `Dispatchers.Unconfined` the right tools for the job?
The context of the question:
I am using Spring, reactive Kafka, WebFlux and coroutines in my project.
My task is to consume a message from a Kafka topic, then call the WebFlux client, retrieve a response and run some logic based on the client response.
The main issue I encountered is how to convert the `Flux` of receiver records into a coroutine `Flow`.
The solution I ended up with is the following piece of code:

    @OptIn(DelicateCoroutinesApi::class)
    override suspend fun connect() = receiver.receive()
        .groupBy { it.receiverOffset().topicPartition() }
        .asFlow()
        .onEach { partition ->
            partition.asFlow()
                .onEach { record -> handleRecord(record) }
                .flowOn(Dispatchers.Unconfined)
                .launchIn(GlobalScope)
        }
        .flowOn(Dispatchers.Unconfined)
        .launchIn(GlobalScope)

I launch the `connect` function for every consumer from a `@PostConstruct` method, where `consumers` is a list of beans implementing `connect`:

    @OptIn(DelicateCoroutinesApi::class)
    @PostConstruct
    fun connectAll() = consumers.forEach { consumer ->
        GlobalScope.launch {
            consumer.connect()
        }
    }

Now, the way I understand it, this works as follows:
- `connectAll()` launches every `connect` method in a separate coroutine in the global scope, so they all run asynchronously without interfering with each other. This is what I need, since consumers should work independently.
- Every `connect` method then receives a `Flux` of `ReceiverRecord`s and groups them by partition.
- Every partition-to-records `GroupedFlux` is run in `GlobalScope`, using whatever thread is available to dispatch the coroutine. If it fails for some reason, it won't crash the other partition-to-records groups.
- Every record within a partition-to-records `GroupedFlux` is run in a separate coroutine in `GlobalScope`, unconfined to a particular thread.
The problem with the unconfined dispatcher is that a coroutine can tie up whichever thread it happens to be running on if it hits a blocking piece of code. That should not be the case in my app, since it is supposed to use a non-blocking stack.
However, I still wonder whether I should change some of the dispatchers in my code.
Would it be more beneficial to use `Dispatchers.IO` for the client calls performed in the `handleRecord` method?
I would not say the record handling method is CPU-intensive enough to warrant `Dispatchers.Default`, so it's probably one of the two above.
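To illustrate what I'm weighing here: as far as I understand, `Dispatchers.IO` is meant for offloading blocking calls, while a suspending call does not hold a thread while it waits. The sketch below has no Spring or Kafka in it and only assumes `kotlinx-coroutines-core` on the classpath. `callClient`, `blockingLookup` and `handleRecordSketch` are hypothetical stand-ins, not my real code; `delay` merely imitates awaiting a non-blocking client response.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical non-blocking client call: delay() suspends the coroutine
// and frees the thread, like awaiting a reactive client response would.
suspend fun callClient(key: String): String {
    delay(50)
    return "response-for-$key"
}

// Hypothetical blocking call: Thread.sleep() holds the thread it runs on.
fun blockingLookup(key: String): String {
    Thread.sleep(50)
    return "blocking-$key"
}

// Hypothetical handler: only the blocking part is moved to Dispatchers.IO.
suspend fun handleRecordSketch(key: String): List<String> {
    val suspended = callClient(key) // no dispatcher switch needed
    val blocking = withContext(Dispatchers.IO) { blockingLookup(key) }
    return listOf(suspended, blocking)
}

fun main(): Unit = runBlocking {
    println(handleRecordSketch("k"))
}
```

If `handleRecord` really only makes suspending client calls, I would guess the first form is enough and the `withContext(Dispatchers.IO)` wrapper only matters for code that actually blocks, but that is exactly the part I'd like confirmed.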
The other thing that confuses me is whether `GlobalScope` is the right tool for the job.
The docs state that it is a delicate API and that you should avoid using it. On the other hand, it does accomplish my goal of running a consumer throughout the application lifecycle.
Furthermore, the coroutines where the records are handled are not children of the one processing the partitions. So if a record happens to produce an exception, it will not affect other records and partitions.
As I see it, `GlobalScope` allows me to run consumers without worrying that an exception from one of them might disturb another consumer.
The same goes for partitions and records.
But maybe it is more beneficial to have your own context for that kind of task?
And is it possible for the records within one partition to be processed in offset order when using this scope?
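To make the last two questions concrete, here is a minimal sketch of what I mean by "own context". It is not my real consumer: there is no Spring or Kafka in it, it only assumes `kotlinx-coroutines-core`, and `Record`, `ConsumerScope`, `processPartition` and `demo` are hypothetical names. The scope uses a `SupervisorJob` so one failing partition does not cancel the others, and each partition is handled with a plain `collect`, which processes one element at a time.

```kotlin
import java.util.concurrent.CopyOnWriteArrayList
import kotlin.coroutines.CoroutineContext
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flowOf

// Hypothetical stand-in for a Kafka record.
data class Record(val partition: Int, val offset: Long)

// Application-owned scope: SupervisorJob isolates child failures,
// and shutdown() cancels everything launched in it.
class ConsumerScope : CoroutineScope {
    private val job = SupervisorJob()
    override val coroutineContext: CoroutineContext = job + Dispatchers.Default
    fun shutdown() = job.cancel()
}

// collect is sequential: the next record is not handled
// until handle() has returned for the current one.
suspend fun processPartition(records: Flow<Record>, handle: suspend (Record) -> Unit) {
    records.collect { handle(it) }
}

// Runs one failing and one healthy partition; returns the offsets
// handled for the healthy partition, in the order they were handled.
fun demo(): List<Long> = runBlocking {
    val scope = ConsumerScope()
    val seen = CopyOnWriteArrayList<Long>()
    val handler = CoroutineExceptionHandler { _, e ->
        println("partition failed: ${e.message}")
    }
    val failing = scope.launch(handler) {
        processPartition(flowOf(Record(0, 0))) { error("boom") }
    }
    val healthy = scope.launch(handler) {
        processPartition(flowOf(Record(1, 0), Record(1, 1), Record(1, 2))) {
            delay(10) // imitates a suspending client call
            seen.add(it.offset)
        }
    }
    joinAll(failing, healthy)
    scope.shutdown()
    seen.toList()
}

fun main() {
    println(demo())
}
```

In this toy version the failure in partition 0 does not stop partition 1, and partition 1's offsets are handled in order. What I can't tell is whether the same holds once the real `GroupedFlux` and `launchIn` are involved.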