I'm attempting to create a Flow to be used with a Source queue. I would like this to work with the Alpakka Google PubSub connector: https://doc.akka.io/docs/alpakka/current/google-cloud-pub-sub.html

In order to use this connector, I need to create a Flow that depends on the topic name provided as a String, as shown in the above link and in the code snippet.

val publishFlow: Flow[PublishRequest, Seq[String], NotUsed] =
  GooglePubSub.publish(topic, config)

The question

I would like to set up a Source queue that receives both the topic and the message needed to publish. I first create the necessary PublishRequest out of the message String. I then want to run it through the Flow that is instantiated by calling GooglePubSub.publish(topic, config). However, I don't know how to get the topic to that part of the flow.

// The output type is Seq[String] because the stream ends in GooglePubSub.publish
val gcFlow: Flow[(String, String), Seq[String], NotUsed] = Flow[(String, String)]
  .map { messageData =>
    PublishRequest(Seq(
      PubSubMessage(new String(Base64.getEncoder.encode(messageData._1.getBytes)))
    ))
  }
  // Problem: `topic` is not in scope here, it only arrives as part of each element
  .via(GooglePubSub.publish(topic, config))

val bufferSize = 10
val elementsToProcess = 5

// newSource is a Source[Seq[String], NotUsed]
val (queue, newSource) = Source
  .queue[(String, String)](bufferSize, OverflowStrategy.backpressure)
  .via(gcFlow)
  .preMaterialize()

I'm not sure if there's a way to get the topic into the queue without it being a part of the initial data stream. And I don't know how to get the stream value into the dynamic Flow.

If I have improperly used some terminology, please keep in mind that I'm new to this.

Jeffrey Chung
Jacob Goodwin

1 Answer

You can achieve it by using flatMapConcat and generating a new Source within it:

// using tuple assuming (Topic, Message)
val gcFlow: Flow[(String, String), (String, PublishRequest), NotUsed] = Flow[(String, String)]
    .map(messageData => {
      val pr = PublishRequest(immutable.Seq(
        PubSubMessage(new String(Base64.getEncoder.encode(messageData._2.getBytes)))))
      // output flow shape of (String, PublishRequest)
      (messageData._1, pr)
    })

val publishFlow: Flow[(String, PublishRequest), Seq[String], NotUsed] =
  Flow[(String, PublishRequest)].flatMapConcat {
    case (topic: String, pr: PublishRequest) =>
      // Create a single-element Source[PublishRequest, NotUsed] and publish it to this element's topic
      Source.single(pr).via(GooglePubSub.publish(topic, config))
  }

// wire it up
val (queue, newSource) = Source
    .queue[(String, String)](bufferSize, OverflowStrategy.backpressure)
    .via(gcFlow)
    .via(publishFlow)
    .preMaterialize()

Optionally, you could replace the tuple with a case class to document it better:

case class Something(topic: String, payload: PublishRequest)

// in gcFlow, emit a Something instead of a tuple
Something(messageData._1, pr)

// Something takes no type parameters, so the Flow is typed on Something alone
Flow[Something].flatMapConcat { s =>
  Source.single(s.payload)... // etc
}

Explanation:

In gcFlow, each element is mapped to a (String, PublishRequest) tuple that carries the topic alongside the request, and that tuple is passed on to publishFlow. Inside flatMapConcat, we create a new single-element Source[PublishRequest, NotUsed] for each tuple and run it through GooglePubSub.publish using that element's topic.

Creating a new Source for every element adds a slight overhead, but it shouldn't have a measurable impact on performance.
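
To try the pattern without Pub/Sub credentials, here is a minimal sketch that swaps GooglePubSub.publish for a stand-in. The names fakePublish and DynamicTopicSketch are hypothetical and not part of Alpakka, and the sketch assumes Akka 2.6+, where an implicit ActorSystem is enough to supply the materializer. It only illustrates how flatMapConcat picks up the topic from each element.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.OverflowStrategy
import akka.stream.scaladsl.{Flow, Sink, Source}

import scala.concurrent.Await
import scala.concurrent.duration._

object DynamicTopicSketch {
  // Hypothetical stand-in for GooglePubSub.publish(topic, config):
  // it just tags each message with the topic it was built for.
  def fakePublish(topic: String): Flow[String, String, NotUsed] =
    Flow[String].map(msg => s"$topic:$msg")

  // Build a fresh inner stream per element, so the topic comes from the data itself.
  val publishFlow: Flow[(String, String), String, NotUsed] =
    Flow[(String, String)].flatMapConcat { case (topic, msg) =>
      Source.single(msg).via(fakePublish(topic))
    }

  def run(): Seq[String] = {
    implicit val system: ActorSystem = ActorSystem("sketch")
    try {
      val (queue, source) = Source
        .queue[(String, String)](10, OverflowStrategy.backpressure)
        .via(publishFlow)
        .preMaterialize()

      // Start consuming, then feed (topic, message) pairs into the queue.
      val result = source.runWith(Sink.seq)
      // With backpressure, wait for each offer to complete before the next one.
      Await.result(queue.offer(("topic-a", "hello")), 5.seconds)
      Await.result(queue.offer(("topic-b", "world")), 5.seconds)
      queue.complete()
      Await.result(result, 5.seconds)
    } finally system.terminate()
  }
}
```

Calling DynamicTopicSketch.run() collects the tagged messages in the order they were offered, since flatMapConcat finishes each inner stream before starting the next one.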

1565986223
  • Thanks for your clear explanation. In most cases I can create a Flow with the topic known ahead of time, but this can come in handy in some cases. – Jacob Goodwin Sep 03 '19 at 23:55