
I have a Seq, x, and a Stream, y, and I wish to prepend x to y to obtain a new Stream. However, the static type of y is causing the Stream to be evaluated immediately, and I am confused why this is the case. Here is an example:

val x: Seq[Int] = Seq(1, 2, 3)
val y: Seq[Int] = Stream(4, 5, 6)
val z = x ++: y // z has dynamic type List instead of Stream

Since the ++: method is called on a Stream instance, I expect to get a Stream as a result, but instead I am getting a List as a result. Can someone please explain why this is happening?

DBear
  • Yeah, I know `Stream` is deprecated. But I need to use an older version of scala to be compatible with a library I'm using. – DBear Jul 30 '21 at 14:33

1 Answer


tl;dr

It comes down to the compiler's type inference. When you call ++: on two values whose static type is Seq, the result is simply built as another Seq. ++: obtains a builder whose result type parameter is Seq, and the default Seq builder is mutable.ListBuffer, whose result is a List[A] (which is also a Seq). So the builder breaks laziness by default: the returned value is a List[Int], while its static type is Seq[Int].
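One detail that is easy to miss: operators ending in a colon are right-associative in Scala, so x ++: y is a method call on y, i.e. y.++:(x). Below is a minimal sketch of that equivalence (the object name AssocDemo is only illustrative). It also prints the runtime class of the result, which on Scala 2.12 should be a List, as the question describes:

```scala
object AssocDemo {
  val x: Seq[Int] = Seq(1, 2, 3)
  val y: Seq[Int] = Stream(4, 5, 6)

  // Operators ending in ':' bind to the right operand,
  // so both expressions below call ++: on y with x as the argument
  val infix: Seq[Int] = x ++: y
  val explicit: Seq[Int] = y.++:(x)

  def main(args: Array[String]): Unit = {
    println(infix == explicit)      // compares the elements of both results
    println(infix.getClass.getName) // runtime class behind the static Seq[Int]
  }
}
```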

Problem investigation

Let's look at the ++: signature (for example, in Scala 2.12.10):

def ++:[B >: A, That](that: TraversableOnce[B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
    val b = bf(repr)
    if (that.isInstanceOf[IndexedSeqLike[_, _]]) b.sizeHint(this, that.size)
    b ++= that
    b ++= thisCollection
    b.result
  }

Here we see the implicit argument bf: CanBuildFrom[Repr, B, That]. In the line:

val b = bf(repr) // b is Builder[B, That]

CanBuildFrom.apply is called, and it returns a Builder[B, That]:

trait CanBuildFrom[-From, -Elem, +To] {
  def apply(from: From): Builder[Elem, To]
}

When we call ++: on two Seq[Int] values, the default CanBuildFrom and newBuilder for sequences are used (from scala.collection.Seq):

object Seq extends SeqFactory[Seq] {
  /** $genericCanBuildFromInfo */
  implicit def canBuildFrom[A]: CanBuildFrom[Coll, A, Seq[A]] = ReusableCBF.asInstanceOf[GenericCanBuildFrom[A]]

  def newBuilder[A]: Builder[A, Seq[A]] = immutable.Seq.newBuilder[A]
}

We see that newBuilder calls immutable.Seq.newBuilder, defined in scala.collection.immutable.Seq:

object Seq extends SeqFactory[Seq] {
  /** $genericCanBuildFromInfo */
  implicit def canBuildFrom[A]: CanBuildFrom[Coll, A, Seq[A]] = ReusableCBF.asInstanceOf[GenericCanBuildFrom[A]]
  def newBuilder[A]: Builder[A, Seq[A]] = new mutable.ListBuffer
}

We see mutable.ListBuffer, which is a strict (non-lazy) builder: it evaluates every element that is appended to it.
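To make the strictness visible, here is a small sketch that counts how many stream elements get computed when they are fed into the default Seq builder. The object name StrictBuilderDemo and the counter are illustrative assumptions, not part of the library:

```scala
object StrictBuilderDemo {
  var evaluated = 0 // counts how many stream elements have been computed

  // A Stream whose elements record when they are computed;
  // right after map, only the head has been evaluated
  val lazyTail: Stream[Int] = Stream(4, 5, 6).map { i => evaluated += 1; i }
  val evaluatedBeforeBuild: Int = evaluated

  // The default Seq builder (a ListBuffer, per the listing above)
  val builder = Seq.newBuilder[Int]
  builder ++= Seq(1, 2, 3)
  builder ++= lazyTail // a strict builder walks the whole stream here
  val built: Seq[Int] = builder.result()

  def main(args: Array[String]): Unit = {
    println(evaluatedBeforeBuild)   // elements computed before building
    println(evaluated)              // elements computed after building
    println(built.getClass.getName) // runtime class of the built Seq
  }
}
```

After building, every element of the stream has been forced, and the result is a List even though its static type is Seq[Int].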

Solution

So, to keep the concatenation lazy, you can pass your own CanBuildFrom for Stream[Int], something like this:

import scala.collection.generic.CanBuildFrom
import scala.collection.mutable
import scala.collection.mutable.Builder

val x: Seq[Int] = Seq(1, 2, 3)
val y: Seq[Int] = Stream(4, 5, 6)
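// An implicit CanBuildFrom in local scope that produces a Stream[Int]
implicit val cbf: CanBuildFrom[Seq[Int], Int, Stream[Int]] =
  new CanBuildFrom[Seq[Int], Int, Stream[Int]] {
    // Stream.newBuilder is a lazy builder: it stores the appended collections
    // and only turns them into a Stream when result() is called
    override def apply(from: Seq[Int]): Builder[Int, Stream[Int]] = Stream.newBuilder[Int]

    override def apply(): mutable.Builder[Int, Stream[Int]] = Stream.newBuilder[Int]
  }
val z = x ++: y // now it will be Stream(1, ?)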

Or you can simply convert both sequences to streams:

val x: Seq[Int] = Seq(1, 2, 3)
val y: Seq[Int] = Stream(4, 5, 6)
val z = x.toStream ++: y.toStream

The compiler will then find the implicit CanBuildFrom in the Stream companion object, which is lazy:

implicit def canBuildFrom[A]: CanBuildFrom[Coll, A, Stream[A]] = new StreamCanBuildFrom[A]
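
A quick way to check that the toStream variant really stays lazy is to use an infinite stream as the right operand: a strict builder would never finish consuming it. This is only a sketch, assuming Scala 2.12; the object name LazyConcatDemo is illustrative:

```scala
object LazyConcatDemo {
  val x: Seq[Int] = Seq(1, 2, 3)
  // An infinite stream: a strict builder would never finish consuming it
  val y: Seq[Int] = Stream.from(4)

  // Both operands are statically Stream[Int], so the Stream CanBuildFrom is selected
  val z: Stream[Int] = x.toStream ++: y.toStream

  def main(args: Array[String]): Unit =
    println(z.take(5).toList) // forces only the first five elements
}
```

The concatenation returns immediately, and elements are only computed as they are requested from z.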
Boris Azanov