I'm working through the examples and exercises in the book Functional Programming in Scala. I'm studying the strictness and laziness chapter, which covers Stream.

I can't understand the output produced by the following code excerpt:

sealed trait Stream[+A]{

  def foldRight[B](z: => B)(f: (A, => B) => B): B =   
  this match {
    case Cons(h,t) => f(h(), t().foldRight(z)(f))   
    case _ => z
  }

  def map[B](f: A => B): Stream[B] = foldRight(Stream.empty[B])((h,t) => {println(s"map h:$h"); Stream.cons(f(h), t)})

  def filter(f:A=>Boolean):Stream[A] = foldRight(Stream.empty[A])((h,t) => {println(s"filter h:$h"); if(f(h)) Stream.cons(h,t) else t})
}

case object Empty extends Stream[Nothing]

case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]   

object Stream {
  def cons[A](hd: => A, tl: => Stream[A]): Stream[A] = {   
    lazy val head = hd   
    lazy val tail = tl
    Cons(() => head, () => tail)
  }
  def empty[A]: Stream[A] = Empty   

  def apply[A](as: A*): Stream[A] =   
    if (as.isEmpty) empty else cons(as.head, apply(as.tail: _*))
}

Stream(1,2,3,4,5,6).map(_+10).filter(_%2==0)

When I execute this code, I receive this output:

map h:1
filter h:11
map h:2
filter h:12

My questions are:

  1. Why are the map and filter outputs interleaved?
  2. Could you explain all the steps involved, from the Stream creation to the last step, that produce this behavior?
  3. Where are the other elements of the list that also pass the filter transformation, i.e. 4 and 6?
Stein
Giorgio
  • I think the last question is not correct. Since this is a stream, I would expect no printing at all. You're not consuming the stream; you are only transforming it. – riccardo.cardin May 14 '18 at 12:25

1 Answer

The key to understanding this behavior, I think, is in the signature of foldRight.

def foldRight[B](z: => B)(f: (A, => B) => B): B = ...

Note that the 2nd argument, f, is a function that takes two parameters: an A and a by-name (lazy) B. Take away that laziness, i.e. declare f: (A, B) => B, and you not only get the expected method grouping (all the map() steps before all the filter() steps), but the steps also come in reverse order, with 6 processed first and 1 processed last, as you'd expect from a foldRight.

How does one little => perform all that magic? It basically says that the 2nd argument to f() is not evaluated when f() is called. It is held in reserve and evaluated only if, and when, the body of f() actually uses it.
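To see the difference that one => makes, here is a minimal sketch that runs the question's pipeline twice: once with the original foldRight and once with a strict variant. The names foldRightStrict, mapStrict, filterStrict, the wrapper object, and the log buffer are made up for this illustration and are not from the book. The trace is collected in a buffer rather than printed as it happens, so the ordering can be inspected.

```scala
import scala.collection.mutable.ArrayBuffer

object StrictVsLazyFold {
  // Trace lines are collected here instead of being printed immediately.
  val log = ArrayBuffer.empty[String]

  sealed trait Stream[+A] {
    // foldRight from the question: f's second parameter is by-name.
    def foldRight[B](z: => B)(f: (A, => B) => B): B = this match {
      case Cons(h, t) => f(h(), t().foldRight(z)(f))
      case _          => z
    }

    // Hypothetical strict variant: f's second parameter is a plain B, so the
    // recursive call has to finish before f can run.
    def foldRightStrict[B](z: => B)(f: (A, B) => B): B = this match {
      case Cons(h, t) => f(h(), t().foldRightStrict(z)(f))
      case _          => z
    }

    def map[B](f: A => B): Stream[B] =
      foldRight(Stream.empty[B]) { (h, t) => log += s"map h:$h"; Stream.cons(f(h), t) }

    def filter(p: A => Boolean): Stream[A] =
      foldRight(Stream.empty[A]) { (h, t) => log += s"filter h:$h"; if (p(h)) Stream.cons(h, t) else t }

    def mapStrict[B](f: A => B): Stream[B] =
      foldRightStrict(Stream.empty[B]) { (h, t) => log += s"map h:$h"; Stream.cons(f(h), t) }

    def filterStrict(p: A => Boolean): Stream[A] =
      foldRightStrict(Stream.empty[A]) { (h, t) => log += s"filter h:$h"; if (p(h)) Stream.cons(h, t) else t }
  }

  case object Empty extends Stream[Nothing]
  case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]

  object Stream {
    def cons[A](hd: => A, tl: => Stream[A]): Stream[A] = {
      lazy val head = hd
      lazy val tail = tl
      Cons(() => head, () => tail)
    }
    def empty[A]: Stream[A] = Empty
    def apply[A](as: A*): Stream[A] =
      if (as.isEmpty) empty else cons(as.head, apply(as.tail: _*))
  }

  // By-name fold: only as much work as is needed to produce the first Cons.
  def lazyTrace(): List[String] = {
    log.clear()
    Stream(1, 2, 3, 4, 5, 6).map(_ + 10).filter(_ % 2 == 0)
    log.toList
  }

  // Strict fold: every element is visited, last element first.
  def strictTrace(): List[String] = {
    log.clear()
    Stream(1, 2, 3, 4, 5, 6).mapStrict(_ + 10).filterStrict(_ % 2 == 0)
    log.toList
  }

  def main(args: Array[String]): Unit = {
    println(lazyTrace().mkString(", "))
    // map h:1, filter h:11, map h:2, filter h:12
    println(strictTrace().mkString(", "))
    // map h:6, ..., map h:1, filter h:16, ..., filter h:11
  }
}
```

With the strict variant, the whole of map finishes before filter starts, and both run from the last element back to the first. With the by-name version, the work stops as soon as the first matching Cons has been built.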

So, attempting to answer your questions.

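  1. Why are the map and filter outputs interleaved?

Because map() and filter() each process only as much of the stream as is requested. Only the first element is handled when each method is called; the work on the rest is wrapped in the unevaluated tail. When filter() rejects 11, it has to force that tail to find the next candidate, which makes map() process 2, and filter() then immediately tests 12. Each element therefore goes through map() and then filter() before the next element is touched.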

  2. Could you explain all the steps involved, from the Stream creation to the last step, that produce this behavior?

Not really. That would take more time and SO answer space than I'm willing to contribute, but let's take just a few steps into the morass.

We start with a Stream, which looks like a series of Cons, each holding an Int and a reference to the next Cons, but that's not completely accurate. Each Cons really holds two functions: when invoked, the 1st produces an Int and the 2nd produces the next Cons.
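As a small sketch of that point, the snippet below builds a single Cons with the question's Stream.cons and counts how often the head expression actually runs. The wrapper object, the demo method, and the evaluations counter are hypothetical names used only for this illustration.

```scala
object ThunkDemo {
  sealed trait Stream[+A]
  case object Empty extends Stream[Nothing]
  case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]

  object Stream {
    // Same smart constructor as in the question: both arguments are by-name
    // and are cached in lazy vals, so each is evaluated at most once.
    def cons[A](hd: => A, tl: => Stream[A]): Stream[A] = {
      lazy val head = hd
      lazy val tail = tl
      Cons(() => head, () => tail)
    }
    def empty[A]: Stream[A] = Empty
  }

  // Returns (evaluations before forcing, h() + h(), evaluations after forcing).
  def demo(): (Int, Int, Int) = {
    var evaluations = 0
    // The head expression bumps the counter every time its body runs.
    val s = Stream.cons({ evaluations += 1; 42 }, Stream.empty[Int])
    val before = evaluations // still 0: building the Cons ran nothing
    s match {
      case Cons(h, _) =>
        val sum = h() + h()  // the first call runs the body, the second hits the cache
        (before, sum, evaluations)
      case Empty =>
        (before, 0, evaluations)
    }
  }

  def main(args: Array[String]): Unit =
    println(demo()) // (0,84,1)
}
```

Constructing the Cons evaluates neither the head nor the tail, and the lazy val inside cons ensures the head expression runs once even though h() is called twice.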

Call map() and pass it the "+10" function. map() creates a new function: "Given h (a value) and t (not yet evaluated), create a new Cons. The head function of the new Cons, when invoked, will apply the "+10" function to the current head value. The new tail function, when invoked, will produce the t value." This new function is passed to foldRight.

foldRight receives the new function, but the evaluation of the function's 2nd argument is delayed until it is needed. h() is called to retrieve the current head value. t() will be called to retrieve the current tail, and foldRight will be called recursively on it, but only when that 2nd argument is actually used.

Call filter() and pass it the "isEven" function. filter() creates a new function: "Given h and t, create a new Cons if h passes the isEven test. If not, return t." Returning t means the real t, not a promise to evaluate it later, so the delayed tail is forced at that point.
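The difference between handing t on (as map() does) and returning t (as the else branch of filter() does) can be sketched in isolation. The helpers defer, forward, and force below are hypothetical names invented for this illustration; they only mimic how a by-name parameter behaves in those two situations.

```scala
object ByNameForwarding {
  // Wrapping a by-name argument in a function keeps it unevaluated, much like
  // Stream.cons does with its tail.
  def defer[A](t: => A): () => A = () => t

  // Passing a by-name argument on to another by-name parameter does not
  // evaluate it either, which is what map does when it hands t to Stream.cons.
  def forward[A](t: => A): () => A = defer(t)

  // Using a by-name argument as the result evaluates it on the spot, which is
  // what the `else t` branch of filter does.
  def force[A](t: => A): A = t

  // Returns the evaluation count after forwarding, after forcing, and after
  // finally invoking the deferred thunk.
  def demo(): (Int, Int, Int) = {
    var runs = 0
    def tail: String = { runs += 1; "rest of the stream" }

    val thunk = forward(tail)
    val afterForward = runs // 0: nothing has been evaluated yet

    force(tail)
    val afterForce = runs   // 1: returning the by-name value evaluated it

    thunk()                 // evaluates the deferred expression now
    (afterForward, afterForce, runs)
  }

  def main(args: Array[String]): Unit =
    println(demo()) // (0,1,2)
}
```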

  3. Where are the other elements of the list that also pass the filter transformation, i.e. 4 and 6?

They are still there waiting to be evaluated. We can force that evaluation by using pattern matching to extract the various Cons one by one.

val c0@Cons(_,_) = Stream(1,2,3,4,5,6).map(_+10).filter(_%2==0)
//  **STDOUT**
//map h:1
//filter h:11
//map h:2
//filter h:12

c0.h()  //res0: Int = 12

val c1@Cons(_,_) = c0.t()
//  **STDOUT**
//map h:3
//filter h:13
//map h:4
//filter h:14

c1.h()  //res1: Int = 14

val c2@Cons(_,_) = c1.t()
//  **STDOUT**
//map h:5
//filter h:15
//map h:6
//filter h:16

c2.h()  //res2: Int = 16
c2.t()  //res3: Stream[Int] = Empty
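The same walk can be done in one go by converting the stream to a List, which forces every tail. The sketch below assumes a toList helper written as a standalone function; that helper, the wrapper object, and the log buffer are not part of the question's code and are added only for illustration.

```scala
import scala.collection.mutable.ArrayBuffer

object ForceWholeStream {
  // Trace lines are collected here instead of being printed immediately.
  val log = ArrayBuffer.empty[String]

  sealed trait Stream[+A] {
    def foldRight[B](z: => B)(f: (A, => B) => B): B = this match {
      case Cons(h, t) => f(h(), t().foldRight(z)(f))
      case _          => z
    }

    def map[B](f: A => B): Stream[B] =
      foldRight(Stream.empty[B]) { (h, t) => log += s"map h:$h"; Stream.cons(f(h), t) }

    def filter(p: A => Boolean): Stream[A] =
      foldRight(Stream.empty[A]) { (h, t) => log += s"filter h:$h"; if (p(h)) Stream.cons(h, t) else t }
  }

  case object Empty extends Stream[Nothing]
  case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]

  object Stream {
    def cons[A](hd: => A, tl: => Stream[A]): Stream[A] = {
      lazy val head = hd
      lazy val tail = tl
      Cons(() => head, () => tail)
    }
    def empty[A]: Stream[A] = Empty
    def apply[A](as: A*): Stream[A] =
      if (as.isEmpty) empty else cons(as.head, apply(as.tail: _*))
  }

  // Hypothetical helper: walks the stream and forces every head and tail.
  // Not stack-safe for long streams; fine for six elements.
  def toList[A](s: Stream[A]): List[A] = s match {
    case Cons(h, t) => h() :: toList(t())
    case Empty      => Nil
  }

  // Returns the fully forced result together with the trace it produced.
  def run(): (List[Int], List[String]) = {
    log.clear()
    val result = toList(Stream(1, 2, 3, 4, 5, 6).map(_ + 10).filter(_ % 2 == 0))
    (result, log.toList)
  }

  def main(args: Array[String]): Unit = {
    val (result, trace) = run()
    trace.foreach(println) // map h:1, filter h:11, ... map h:6, filter h:16
    println(result)        // List(12, 14, 16)
  }
}
```

Every element now appears in the trace, still as alternating map/filter pairs, because each tail is forced only after the element before it has been dealt with.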
jwvh
  • Thanks for your reply. As far as I understand, the output produced by `Stream(1,2,3,4,5,6).map(_+10).filter(_%2==0)` is correct because the chain stops as soon as the head (12) is found; all the other elements of the tail are then lazily evaluated. **But I still don't understand the interleaved behavior.** Could you be more exhaustive about this, please? **I can't clearly see the relationship between lazy evaluation and the interleaved behavior.** – Giorgio May 15 '18 at 15:42
  • Take this hint: the tail part in foldRight would never be forced to be evaluated if it wasn't for that .toList at the end. – Jimmy Page Sep 29 '19 at 15:28
  • @jwvh In your answer, you describe the 2nd parameter of function f as "...a by-name (lazy) B...". But from the topic below, I think that "...lazy B..." is by-need, NOT by-name. Please refer to: https://stackoverflow.com/questions/50317282/understand-stream-scala-interleaved-transformations-behavior – datnt Nov 14 '19 at 03:59
  • @datnt; You've posted a link to this very same question so I'm not sure what "below topic" you suggest I "refer to". The parameter in question is declared as `=> B`. The established and accepted terminology for this Scala syntax is "passed by name" or simply a "by-name parameter" which is a parameter evaluated on reference. I don't much care for the terminology, I don't think it's very descriptive, but then they didn't ask me when choosing it. – jwvh Nov 14 '19 at 04:38
  • @jwvh: I'm sorry that I posted the incorrect link. Here is the one I intended to refer to: https://stackoverflow.com/questions/16407025/scala-stream-call-by-need-lazy-vs-call-by-name – datnt Nov 14 '19 at 07:02
  • @datnt; Thanks for the link, but "by-need" is not the common, accepted terminology for this evaluation type. I would refer you to [this answer](https://stackoverflow.com/a/11992336/4993128), and the links it provides, to back up my assertions. – jwvh Nov 14 '19 at 07:22
  • @jwvh: Thank you for your feedback. I fully realize that passed-by-name is the accepted terminology. I think there's some confusion within the book's content, but I can't figure it out at the moment. Could you please shed some light on passing Stream.empty[B] as the 1st parameter to foldRight? Is passing Stream.empty[B] as the 1st parameter just a trick to delay evaluation of the head and tail functions? I think the "case _ => z" case only matters at the bottom (i.e. at the end) of the Stream. Other than that, I can't see the point of having Stream.empty[B] at all. – datnt Nov 14 '19 at 09:41
  • @datnt; SO comments are not the place for detailed discourse on deserving topics. But to address some of your points: passing `Stream.empty[_]` is not a trick and doesn't affect any delayed evaluations. All fold (Left/Right) operations require a "zero" element of the same type as the result. In most cases it's used at the beginning to start the folding iterations. In this case it is saved until the end to conclude the folding. You'd think that the `object Empty` could be used instead, since that's what every `Stream.empty` actually is, but that won't type-check. – jwvh Nov 15 '19 at 07:19