
I wanted to stream the data returned from a slick 3.0.0 query via db.stream(yourquery) through scalaz-stream.

It looks like reactive-streams.org defines an API and dataflow model that different libraries implement.

How do you do that with the back pressure flowing back from the scalaz-stream process to the slick publisher?

user1763729
    You could generalize the question to: How do I hook up scalaz-streams to reactive streams (as in reactive-streams.org) – cvogt Feb 26 '15 at 07:28

3 Answers


Take a look at https://github.com/krasserm/streamz

Streamz is a resource combinator library for scalaz-stream. It allows Process instances to consume from and produce to:

  • Apache Camel endpoints
  • Akka Persistence journals and snapshot stores and
  • Akka Stream flows (reactive streams) with full back-pressure support
Nettok

I finally answered my own question. This works if you are willing to use a scalaz-stream queue to buffer the streaming results.

import org.reactivestreams.{Subscriber, Subscription}
import scalaz.concurrent.Task

def getData[T](publisher: slick.backend.DatabasePublisher[T],
  queue: scalaz.stream.async.mutable.Queue[T], batchRequest: Int = 1): Task[scala.concurrent.Future[Long]] =
  Task {
    // Completes with the number of elements streamed once the publisher is done.
    val p = scala.concurrent.Promise[Long]()
    var counter: Long = 0
    val s = new Subscriber[T] {
      var sub: Subscription = _

      def onSubscribe(s: Subscription): Unit = {
        sub = s
        sub.request(batchRequest)
      }

      def onComplete(): Unit = {
        sub.cancel()
        p.success(counter)
      }

      def onError(t: Throwable): Unit = p.failure(t)

      def onNext(e: T): Unit = {
        counter += 1
        // Enqueue synchronously, then request the next batch: this is
        // what propagates back pressure to the Slick publisher.
        queue.enqueueOne(e).run
        sub.request(batchRequest)
      }
    }
    publisher.subscribe(s)
    p.future
  }

When you run this Task using run, you obtain a future that, when completed, means the query has finished streaming. You can compose on this future if you want your computation to wait for all the data to arrive. You could also use an Await inside the Task in getData and then compose your computation on the returned Task object if you need all the data before continuing. For what I do, I compose on the future's completion and shut down the queue so that my scalaz-stream knows to terminate cleanly.
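To illustrate the wiring described above, here is a minimal sketch. The names db, query, Row, and processRow are hypothetical placeholders, not from the answer:

```scala
// Sketch only: assumes getData from above is in scope, and that
// `db`, `query`, `Row`, and `processRow` exist in your code.
import scala.concurrent.ExecutionContext.Implicits.global
import scalaz.stream.async

val queue = async.boundedQueue[Row](100)

// Running the Task starts the subscription and yields a Future that
// completes when the publisher has emitted everything.
val done = getData(db.stream(query), queue).run

// Close the queue when streaming finishes so the consuming Process
// terminates cleanly instead of waiting forever for more elements.
done.onComplete(_ => queue.close.run)

// Consume with back pressure: Slick is only asked for more rows as
// fast as the bounded queue drains.
queue.dequeue.map(processRow).run.run
```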

user1763729

Here is a slightly different implementation (than the one posted by user1763729) which returns a Process:

import org.reactivestreams.{Subscriber, Subscription}
import scalaz.concurrent.Task
import scalaz.stream.{async, Process}
import slick.backend.DatabasePublisher

def getData[T](publisher: DatabasePublisher[T], batchSize: Long = 1L): Process[Task, T] = {
  // Bounded queue: enqueueing blocks when full, which throttles the publisher.
  val q = async.boundedQueue[T](10)

  val subscribe = Task.delay {
    publisher.subscribe(new Subscriber[T] {

      @volatile var subscription: Subscription = _

      override def onSubscribe(s: Subscription): Unit = {
        subscription = s
        subscription.request(batchSize)
      }

      override def onNext(next: T): Unit = {
        q.enqueueOne(next).attemptRun
        subscription.request(batchSize)
      }

      // Propagate failure and completion to the queue so the
      // dequeuing Process terminates accordingly.
      override def onError(t: Throwable): Unit = q.fail(t).attemptRun

      override def onComplete(): Unit = q.close.attemptRun
    })
  }

  Process.eval(subscribe).flatMap(_ => q.dequeue)
}
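A brief usage sketch of this Process-returning variant; db, query, and Row are hypothetical placeholders:

```scala
// Sketch only: assumes the getData above is in scope, plus a Slick
// `db`, a `query`, and a result type `Row` from your own code.
import scalaz.concurrent.Task
import scalaz.stream.Process

// Request rows in batches of 10; demand only flows upstream as the
// bounded queue drains, so back pressure reaches the Slick publisher.
val rows: Process[Task, Row] = getData(db.stream(query), batchSize = 10L)

// Collect the first 100 rows; once the Process halts, no further
// demand is signalled upstream.
val firstHundred = rows.take(100).runLog.run
```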
Jan