I am learning and evaluating Spark and Flink before picking one of them for a project.

As part of my evaluation I came up with the following simple task, which I can't figure out how to implement in either framework.

Let's say that:

1. I have a stream of events, each of which simply signals that some item has changed somewhere in a database.

2. For each of those events, I need to query the DB to get the new version of the item.

3. I apply some transformation to it.

4. I connect to another DB and write the result.

My question is as follows:

Using Flink or Spark, how can one make sure that the calls to the DBs are handled asynchronously, to avoid thread starvation?

I come from Scala/Akka, where we typically avoid blocking calls and use futures all the way down for this kind of situation. Akka Streams allows that fine-grained level of control over stream processing, for instance when integrating a stream with an external service. This avoids thread starvation: while I wait on an I/O operation, the thread can be used for something else.

In short, I don't see how to work with futures in either framework.

I believe this can somehow be reproduced in both frameworks.

Can anyone please explain how this is supposed to be handled in Flink or Spark?

If this is not supported out of the box, does anyone have experience with getting it incorporated somehow?

MaatDeamon
  • The question is a bit open-ended. Only thing I can suggest is to implement a prototype in all technologies and see what works best. – Rüdiger Klaehn May 26 '16 at 08:09

1 Answer

Since version 1.2.0 of Flink, you can now use the Async I/O API to achieve this.
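
To illustrate the non-blocking pattern the question describes (fetch, transform, write, with futures all the way), here is a minimal sketch in plain Java using only `CompletableFuture`. It is not Flink code: the class name `AsyncEnrichmentSketch`, the in-memory maps standing in for the two databases, and the helpers `fetchItem`, `writeItem`, `process` and `processAll` are all hypothetical, and the 50 ms delay only simulates I/O latency. As far as I understand the Async I/O API, in Flink the equivalent client call would go inside an `AsyncFunction` that is applied to the stream with `AsyncDataStream.unorderedWait` or `AsyncDataStream.orderedWait`, and it only helps if the database client itself is asynchronous.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class AsyncEnrichmentSketch {
    // Hypothetical stand-in for the source database: item id -> latest version.
    static final Map<String, String> SOURCE_DB = new ConcurrentHashMap<>();
    // Hypothetical stand-in for the target database.
    static final Map<String, String> SINK_DB = new ConcurrentHashMap<>();

    static {
        SOURCE_DB.put("a", "alpha");
        SOURCE_DB.put("b", "beta");
    }

    // One daemon timer thread simulates I/O latency; callers never block on it.
    static final ScheduledExecutorService TIMER =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "fake-io");
            t.setDaemon(true);
            return t;
        });

    // Step 2: asynchronous lookup of the new version of the item.
    static CompletableFuture<String> fetchItem(String id) {
        CompletableFuture<String> result = new CompletableFuture<>();
        TIMER.schedule(() -> { result.complete(SOURCE_DB.get(id)); },
                       50, TimeUnit.MILLISECONDS);
        return result;
    }

    // Step 4: asynchronous write to the second database.
    static CompletableFuture<Void> writeItem(String id, String value) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        TIMER.schedule(() -> {
            SINK_DB.put(id, value);
            done.complete(null);
        }, 50, TimeUnit.MILLISECONDS);
        return done;
    }

    // Steps 2 to 4 chained without blocking any thread between the stages.
    static CompletableFuture<Void> process(String id) {
        return fetchItem(id)
            .thenApply(String::toUpperCase)            // step 3: the transformation
            .thenCompose(value -> writeItem(id, value));
    }

    // Starts all requests first, then waits once for the whole batch.
    static void processAll(List<String> ids) {
        CompletableFuture<?>[] inFlight = ids.stream()
            .map(AsyncEnrichmentSketch::process)
            .toArray(CompletableFuture[]::new);
        CompletableFuture.allOf(inFlight).join();
    }

    public static void main(String[] args) {
        processAll(Arrays.asList("a", "b"));
        System.out.println(SINK_DB);
    }
}
```

The point of the sketch is that every request is in flight before anything waits, so a single thread can serve many concurrent lookups. With a client that only offers blocking calls, the usual workaround is to run those calls on a dedicated thread pool and complete the future from there, which bounds the problem rather than removing it.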

bp2010