
For some reason, my Akka streams always wait for a second message before "emitting"(?) the first.

Here is some example code that demonstrates my problem.

val rx = Source((1 to 100).toStream.map { t =>
  Thread.sleep(1000)
  println(s"doing $t")
  t
})
rx.runForeach(println)

yields output:

doing 1
doing 2
1
doing 3
2
doing 4
3
doing 5
4
doing 6
5
...

What I want:

doing 1
1
doing 2
2
doing 3
3
doing 4
4
doing 5
5
doing 6
6
...
SGHAF
    Akka Streams always has a buffer of one element before every computation stage. That is probably what you are seeing. – eiennohito Jul 11 '16 at 02:27

2 Answers


The way your code is set up now, you are transforming the data before it ever becomes a Source, so the work happens outside the stream instead of in response to downstream demand. You can see that clearly (as @slouc stated) by removing the toStream on the range of numbers that backs the source: the whole range is then transformed before the Source starts responding to downstream demand. If you actually want to run a Source into a Sink with a transformation step in the middle, you can structure things like this:

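import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}

// An implicit ActorSystem is enough on Akka 2.6+; older versions also
// need an implicit ActorMaterializer() in scope. The name "demo" is arbitrary.
implicit val system: ActorSystem = ActorSystem("demo")

// The work now lives in a stream stage instead of in the source collection.
val transform =
  Flow[Int].map { t =>
    Thread.sleep(1000)
    println(s"doing $t")
    t
  }

Source((1 to 100).toStream)
  .via(transform)
  .to(Sink.foreach(println))
  .run()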

If you make that change, then you will get the desired effect, which is that an element flowing downstream gets processed all the way through the flow before the next element starts to be processed.

cmbaxter
  • I thought I would be able to simplify my problem and have it solved, but I think I oversimplified it. Your answer does fix the problem I posted though, I may have to make a new question with more details. – SGHAF Jul 10 '16 at 22:26

You are using .toStream, which makes the collection lazy. Without it, your output would be a hundred "doing" lines followed by the numbers from 1 to 100. A Stream, however, evaluates only its first element up front, which gives the "doing 1" output, and that is where it stops. Each further element is evaluated only when it is needed.
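To make the strict-versus-lazy difference concrete, here is a minimal sketch using only the Scala standard library (no Akka). The object name LazinessDemo and the log buffer are made up for illustration, and the range is shortened to 3 elements. On Scala 2.13+ `Stream` is deprecated in favour of `LazyList`, which is lazy in its head as well, so this sketch keeps the `Stream` from the question.

```scala
import scala.collection.mutable.ArrayBuffer

object LazinessDemo {
  // Returns the order in which elements were evaluated.
  def run(): Vector[String] = {
    val log = ArrayBuffer.empty[String]

    // Strict collection: map visits every element immediately.
    val strict = (1 to 3).map { t => log += s"strict $t"; t }

    // Stream: map evaluates only the head when the Stream is built.
    val lazily = (1 to 3).toStream.map { t => log += s"lazy $t"; t }

    // Asking for the second element forces exactly one more evaluation.
    lazily(1)

    log.toVector
  }
}
```

Calling `LazinessDemo.run()` should show all three "strict" entries first, then "lazy 1" from construction and "lazy 2" from the explicit access, with "lazy 3" never evaluated.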

Now, I couldn't find any details on this in the docs, but I presume that runForeach is implemented so that it fetches the next element before invoking the function on the current one. So before calling println on element n, it first examines element n+1 (e.g. checks whether it exists), which produces the "doing n+1" message. Then it runs your println on the current element, which produces the "n" message.
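The presumed behaviour can be modelled without Akka. The sketch below is only a toy model of the hypothesis in the previous paragraph, not how runForeach is actually implemented: LookaheadDemo and consume are hypothetical names, and the consumer simply takes the tail of the Stream before handling the head.

```scala
import scala.collection.mutable.ArrayBuffer

object LookaheadDemo {
  def run(): Vector[String] = {
    val out = ArrayBuffer.empty[String]

    // Same shape as the question's source, shortened to 3 elements.
    // Building the mapped Stream already evaluates the head ("doing 1").
    val source = (1 to 3).toStream.map { t => out += s"doing $t"; t }

    // Hypothetical consumer that looks ahead before handling the current element.
    def consume(s: Stream[Int]): Unit =
      if (s.nonEmpty) {
        val rest = s.tail        // taking the tail evaluates element n+1
        out += s.head.toString   // only now is element n handled
        consume(rest)
      }

    consume(source)
    out.toVector
  }
}
```

Under that assumption the log interleaves the same way as the output in the question: each "doing n+1" appears before "n" is printed.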

Do you really need to map() before you runForeach? I mean, do you need two traversals of the data? I know I'm probably stating the obvious, but if you just process your data in one go like this:

val rx = Source((1 to 100).toStream)
rx.runForeach({ t =>
  Thread.sleep(1000)
  println(s"doing $t")
  // do something with 't', which is now equal to what "doing" says
})

then you don't have a problem of what's evaluated when.

slouc
  • I've actually simplified the problem when I posted it. The stream is coming from a Java BufferedReader (which I get from a Socket). `Source(Stream.continually(myBufferedReader.readLine).map(t => (System.nanoTime, t)))` I was doing this originally in order to attempt to "timestamp" when my messages actually arrived at my machine, and again when they were finally processed. Thanks for the answer, though. This is all more clear to me now! – SGHAF Jul 10 '16 at 20:37
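For the BufferedReader case described in the comment above, one possible sketch is to build a fully lazy Iterator instead of a Stream, so that no line is read or timestamped until something pulls it. TimestampedLines is a hypothetical helper, not part of Akka or of the original code, and it assumes that stopping at the first null from readLine (end of input) is acceptable.

```scala
import java.io.{BufferedReader, StringReader}

object TimestampedLines {
  // Lazily reads lines until end of input (readLine returns null) and pairs
  // each line with System.nanoTime() taken when that line is actually pulled.
  def apply(reader: BufferedReader): Iterator[(Long, String)] =
    Iterator
      .continually(reader.readLine())
      .takeWhile(_ != null)
      .map(line => (System.nanoTime(), line))
}

object TimestampedLinesExample {
  // Usage with an in-memory reader standing in for the socket's reader.
  def run(): List[(Long, String)] =
    TimestampedLines(new BufferedReader(new StringReader("first\nsecond"))).toList
}
```

An iterator like this could be handed to Akka Streams with `Source.fromIterator(() => TimestampedLines(reader))`. Note that the timestamp then reflects when the stream pulled the line, which under backpressure may be later than when the bytes reached the machine.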