I'm rethinking how our Spring MVC application fetches data: is it better to pull the data from the database (Java 8 Stream), or to let the database push its data (Reactive / Observable) and use backpressure to control the amount?
Current situation:
1. User requests the 30 most recent articles
2. Service does a database query and puts the 30 results into a List
3. Jackson iterates over the List and generates the JSON response
Why switch the implementation?
It's quite memory consuming, because we keep all 30 objects in memory for the whole request. That's not needed, because the application processes one object at a time. So the application should be able to retrieve one object, process it, throw it away, and get the next one.
Java8 Streams? (pull)
With java.util.stream.Stream this is quite easy: the Service creates a Stream, which uses a database cursor behind the scenes. Each time Jackson has written the JSON String for one element of the Stream, it asks for the next one, which then triggers the database cursor to return the next entry.
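To make the pull idea concrete, here is a minimal sketch using only the JDK. `FakeCursor`, `findRecentArticles` and `render` are made-up names: the cursor is an in-memory stand-in for a real database cursor, and the string concatenation stands in for Jackson. It only illustrates that a sequential Stream fetches a row, hands it to the consumer, and then fetches the next one.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class CursorStreamSketch {

    /** Hypothetical stand-in for a database cursor; logs every row it hands out. */
    static final class FakeCursor implements Iterator<String> {
        private final int total;
        private final List<String> log;
        private int fetched;

        FakeCursor(int total, List<String> log) {
            this.total = total;
            this.log = log;
        }

        @Override
        public boolean hasNext() {
            return fetched < total;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            fetched++;
            log.add("fetch " + fetched);
            return "article-" + fetched;
        }
    }

    /** Plays the role of the Service: wraps the cursor in a lazy, sequential Stream. */
    static Stream<String> findRecentArticles(int total, List<String> log) {
        Iterator<String> cursor = new FakeCursor(total, log);
        Spliterator<String> rows =
                Spliterators.spliteratorUnknownSize(cursor, Spliterator.ORDERED);
        // onClose is where a real implementation would release the cursor.
        return StreamSupport.stream(rows, false)
                .onClose(() -> log.add("cursor closed"));
    }

    /** Plays the role of Jackson: serializes one element, then pulls the next. */
    static List<String> render(int total) {
        List<String> log = new ArrayList<>();
        try (Stream<String> articles = findRecentArticles(total, log)) {
            articles.map(title -> "{\"title\":\"" + title + "\"}")
                    .forEach(json -> log.add("write " + json));
        }
        return log;
    }

    public static void main(String[] args) {
        // Prints the interleaved fetch/write events, then "cursor closed".
        render(3).forEach(System.out::println);
    }
}
```

The log shows fetches and writes alternating, so only one row is alive at a time. The try-with-resources block matters in practice too, because a cursor-backed Stream has to be closed to release the underlying resources.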
RxJava / Reactive / Observable? (push)
Here we have the opposite scenario: the database has to push entry by entry, and Jackson has to create the JSON String for each element until the onComplete method has been called.
I.e. the Controller tells the Service: give me an Observable<Article>. Then Jackson can ask for as many database entries as it can process.
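The "ask for as many entries as it can process" part can be sketched as follows. This is not RxJava: to keep it dependency-free it uses the JDK's `java.util.concurrent.Flow` interfaces (Java 9+), which have the same `request(n)` demand protocol. `FakeCursor`, `CursorPublisher` and `SlowWriter` are hypothetical names, and the publisher is a simplified single-threaded one, not a spec-complete implementation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.Flow;

class BackpressureSketch {

    /** Hypothetical stand-in for a database cursor that counts the rows it hands out. */
    static final class FakeCursor implements Iterator<String> {
        private final int total;
        int fetched;

        FakeCursor(int total) { this.total = total; }

        @Override public boolean hasNext() { return fetched < total; }

        @Override public String next() {
            if (!hasNext()) throw new NoSuchElementException();
            fetched++;
            return "article-" + fetched;
        }
    }

    /** Plays the role of the Service: emits rows only while the subscriber has open demand. */
    static final class CursorPublisher implements Flow.Publisher<String> {
        private final Iterator<String> cursor;

        CursorPublisher(Iterator<String> cursor) { this.cursor = cursor; }

        @Override
        public void subscribe(Flow.Subscriber<? super String> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private long demand;
                private boolean emitting;
                private boolean done;

                @Override
                public void request(long n) {
                    if (done) return;
                    if (n <= 0) {
                        done = true;
                        subscriber.onError(new IllegalArgumentException("request must be positive"));
                        return;
                    }
                    demand += n;
                    if (emitting) return; // re-entrant call from onNext; the loop below sees the new demand
                    emitting = true;
                    while (demand > 0 && !done && cursor.hasNext()) {
                        demand--;
                        subscriber.onNext(cursor.next());
                    }
                    emitting = false;
                    if (!done && !cursor.hasNext()) {
                        done = true;
                        subscriber.onComplete();
                    }
                }

                @Override
                public void cancel() { done = true; }
            });
        }
    }

    /** Plays the role of Jackson: requests one row at a time, up to a fixed capacity. */
    static final class SlowWriter implements Flow.Subscriber<String> {
        final List<String> written = new ArrayList<>();
        boolean completed;
        Throwable error;
        private final int capacity;
        private Flow.Subscription subscription;

        SlowWriter(int capacity) { this.capacity = capacity; }

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            if (capacity > 0) s.request(1);
        }

        @Override public void onNext(String title) {
            written.add("{\"title\":\"" + title + "\"}");
            if (written.size() < capacity) subscription.request(1); // signal readiness for one more row
        }

        @Override public void onError(Throwable t) { error = t; }

        @Override public void onComplete() { completed = true; }
    }

    static String run(int totalRows, int capacity) {
        FakeCursor cursor = new FakeCursor(totalRows);
        SlowWriter writer = new SlowWriter(capacity);
        new CursorPublisher(cursor).subscribe(writer);
        return "fetched=" + cursor.fetched + " written=" + writer.written.size()
                + " completed=" + writer.completed;
    }

    public static void main(String[] args) {
        System.out.println(run(30, 5));  // fetched=5 written=5 completed=false
        System.out.println(run(30, 30)); // fetched=30 written=30 completed=true
    }
}
```

In the first run the consumer stops requesting after five rows, and the cursor is never advanced past them: the producer is throttled by demand rather than by a buffer filling up. A real reactive driver would emit asynchronously and prefetch in batches, which this sketch deliberately leaves out.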
Differences and concern:
With Streams there's always some delay between asking for the next database entry and retrieving / processing it. This could slow down the JSON response time if the network connection is slow or if a huge number of database requests have to be made to fulfill the response.
Using RxJava there should always be data available to process. And if it's too much, we can use backpressure to slow down the data transfer from the database to our application. In the worst case scenario the buffer/queue will contain all requested database entries. Then the memory consumption will be equal to our current solution using a List.
Why am I asking / What am I asking for?
What did I miss? Are there any other pros / cons?
Why did (especially) the Spring Data team extend their API to support Stream responses from the database, if there's always a (short) delay between each database request/response? This could add up to a noticeable delay for a huge number of requested entries.
Is it recommended to go for RxJava (or some other reactive implementation) for this scenario? Or did I miss any drawbacks?