I have a list of streams, List[Stream[_]]. The size of the list is known at the beginning of the function, and the size of each stream is either n or n+1. I'd like to obtain an interleaved stream, e.g.:

def myMagicFold[A](s: List[Stream[A]]): Stream[A]

val streams = List(Stream(1,1,1),Stream(2,2,2),Stream(3,3),Stream(4,4)) 

val result = myMagicFold(streams)

//result = Stream(1,2,3,4,1,2,3,4,1,2)

I'm using fs2.Stream. My first take:

val result = streams.fold(fs2.Stream.empty){
   case (s1, s2) => s1.interleaveAll(s2)
}

// result = Stream(1, 4, 3, 4, 2, 3, 1, 2, 1, 2)
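
To see where that ordering comes from, the intermediate results of the fold can be traced with scanLeft. This is only a minimal sketch, assuming fs2 is on the classpath and using pure streams; the object name FoldTrace is made up for illustration:

```scala
import fs2.{Pure, Stream}

// Hypothetical helper object, not part of the original code.
object FoldTrace {
  val streams: List[Stream[Pure, Int]] =
    List(Stream(1, 1, 1), Stream(2, 2, 2), Stream(3, 3), Stream(4, 4))

  // scanLeft keeps every intermediate accumulator of the fold.
  val steps: List[List[Int]] =
    streams
      .scanLeft(Stream.empty: Stream[Pure, Int])((acc, s) => acc.interleaveAll(s))
      .map(_.toList)

  def main(args: Array[String]): Unit = steps.foreach(println)
  // List()
  // List(1, 1, 1)
  // List(1, 2, 1, 2, 1, 2)
  // List(1, 3, 2, 3, 1, 2, 1, 2)
  // List(1, 4, 3, 4, 2, 3, 1, 2, 1, 2)
}
```

Each step interleaves the next stream with the already flattened result of the previous steps, so elements of later streams end up between elements that should have stayed adjacent.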

I'm looking for a solution based on basic operations (map, fold,...)

jker
  • Not really. As I mentioned at the end of the question, I'm looking for a generic solution – jker Oct 01 '19 at 22:49
  • Anyway, a serious question about the problem: you said the number of Streams in the list is known before calling the function. Does that mean at compile time? If N is always going to be a small number, it may be simpler to create overloaded versions of the function for 2 to N streams. – Luis Miguel Mejía Suárez Oct 01 '19 at 22:52
  • Unfortunately it isn't known at compile time; moreover, N might be big – jker Oct 01 '19 at 22:55
  • After thinking for a while, I doubt this can be done, at least not generically _(it may be possible though; I do not know too many things)_. I think you could write a custom `fs2.Pull`, but that would be far from general and it still seems very hard to do. Nevertheless, I would recommend you also ask in the [fs2 **gitter** channel](https://gitter.im/functional-streams-for-scala/fs2) whether this is possible in any way; they will know how. - PS: If you solve this, remember you can answer your own question. – Luis Miguel Mejía Suárez Oct 01 '19 at 23:17

3 Answers

3

Your initial guess was good. However, interleaveAll flattens too soon, which is why you don't get the expected order. Here's code that should do what you are trying to achieve:


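  import fs2.Stream

  // Builds one "row" per position (the n-th element of every stream),
  // then emits the rows one after another.
  def zipAll[F[_], A](streams: List[Stream[F, A]]): Stream[F, A] =
    streams
      .foldLeft[Stream[F, List[Option[A]]]](Stream.empty) { (acc, s) =>
        zipStreams(acc, s)
      }
      // Rows are built by prepending, so reverse them; flatten drops the None padding.
      .flatMap(l => Stream.emits(l.reverse.flatten))

  // Prepends the next element of s2 to each row; an exhausted side is padded with Nil / None.
  def zipStreams[F[_], A](s1: Stream[F, List[Option[A]]], s2: Stream[F, A]): Stream[F, List[Option[A]]] =
    s1.zipAllWith(s2.map(Option(_)))(Nil, Option.empty[A]) { case (acc, a) => a :: acc }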

In this case, you add the n-th element of each stream to a list, convert each list to a Stream, and flatten those into the result stream. Since fs2.Stream is pull-based, you only have one list in memory at a time.
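
As a quick check of the row-by-row idea described above, here is a minimal sketch that runs the same two functions on the input from the question. It assumes fs2 is on the classpath and uses pure streams so no effect type is needed; the object name ZipAllDemo is made up for illustration:

```scala
import fs2.{Pure, Stream}

// Hypothetical wrapper object, not part of the original answer.
object ZipAllDemo {
  // Prepends the next element of s2 to each row, padding exhausted sides with Nil / None.
  def zipStreams[F[_], A](s1: Stream[F, List[Option[A]]], s2: Stream[F, A]): Stream[F, List[Option[A]]] =
    s1.zipAllWith(s2.map(Option(_)))(Nil, Option.empty[A])((acc, a) => a :: acc)

  // Folds all streams into a stream of rows, then emits each row in the original order.
  def zipAll[F[_], A](streams: List[Stream[F, A]]): Stream[F, A] =
    streams
      .foldLeft[Stream[F, List[Option[A]]]](Stream.empty)((acc, s) => zipStreams(acc, s))
      .flatMap(row => Stream.emits(row.reverse.flatten))

  val streams: List[Stream[Pure, Int]] =
    List(Stream(1, 1, 1), Stream(2, 2, 2), Stream(3, 3), Stream(4, 4))

  def main(args: Array[String]): Unit =
    println(zipAll(streams).toList) // List(1, 2, 3, 4, 1, 2, 3, 4, 1, 2)
}
```

The last row here is List(None, None, Some(2), Some(1)) before it is reversed and flattened, which is how the shorter streams drop out without disturbing the order.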

J Markus
1

Here is an attempt that works for streams of equal length:

import cats.effect.IO
import cats.implicits._
import fs2.Stream

def myMagicFold[A](streams: List[Stream[IO, A]]): Stream[IO, A] =
  Stream.unfoldEval(streams) { streams =>
    streams.traverse { stream =>
      stream.head.compile.last
    } map { list =>
      list.sequence.map { chunk =>
        // Advance the original streams; `list` only holds the heads (Option[A]), which have no tail.
        Stream.emits(chunk) -> streams.map(_.tail)
      }
    }
  }.flatten

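However, this is far from a good solution. It is extremely inefficient, since it re-evaluates each Stream from the beginning on every step. It also stops as soon as any stream runs out, so leftover elements of longer streams are dropped.
You can confirm both with this code: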

def stream(name: String, n: Int, value: Int): Stream[IO, Int] =
  Stream
    .range(start = 0, stopExclusive = n)
    .evalMap { i =>
      IO {
        println(s"${name} - ${i}")
        value
      }
    }
    
val list = List(stream("A", 3, 1), stream("B", 2, 2), stream("C", 3, 3))
myMagicFold(list).compile.toList.unsafeRunAsync(println)

This prints:

A - 0
B - 0
C - 0
A - 0
A - 1
B - 0
B - 1
C - 0
C - 1
A - 0
A - 1
A - 2
B - 0
B - 1
C - 0
C - 1
C - 2

Right(List(1, 2, 3, 1, 2, 3))

I am pretty sure this can be fixed using Pulls, but I do not have any experience with that.

Community
  • Thanks for your time. I'm currently working on a solution based on Pulls; I'll post it as soon as I get it working with immutable state – jker Oct 02 '19 at 16:23
0

Just to iterate on the accepted solution: the code below is basically the same, but it does not use Lists internally, only Streams (and it avoids the final reverse on each List).

The only thing that needs to be known eagerly remains the number of Streams to be interleaved (hence the streams: List[Stream[F, A]]):

def interleaveAll[F[_], A](streams: List[Stream[F, A]]): Stream[F, A] =
  streams
    .foldLeft[Stream[F, Stream[F, Option[A]]]](Stream.empty) { (acc, s) =>
      interleaveStreams(acc, s)
    }
    .flatMap(l => l.unNone)

private def interleaveStreams[F[_], A](
    s1: Stream[F, Stream[F, Option[A]]],
    s2: Stream[F, A],
): Stream[F, Stream[F, Option[A]]] =
  s1.zipAllWith(s2.map(Option(_)))(Stream.empty, Option.empty[A]) { case (acc, a) => acc ++ Stream(a) }

mdm