We are using `com.spotify.scio.testing.JobTest` for end-to-end testing of our Scio pipeline. The pipeline includes a DoFn that is sensitive to the order in which elements arrive on a stream of infrequently updated configuration data.

We pass an ordered List of configuration values, `combinedSampleConfig`, as input to the JobTest builder. Is there a way to have JobTest preserve the ordering of this CustomIO input stream when we run an end-to-end test?

I see that the testing framework gives fine control over source arrival time (using `advanceProcessingTime`) when testing individual components, but I do not see how to apply this to end-to-end testing with JobTest.

    JobTest[MyApp.type]
      .args(commonArgs ++ Seq(
        "--numWorkers=1",
        "--maxNumWorkers=1"
      ): _*)
      .input(CustomIO[PubsubMessage](CONFIG_ID), combinedSampleConfig)
      .input(CustomIO[IndicatorEntry](INPUT_ID), sampleInput)
      .output(CustomIO[EnrichedIndicatorEntry](AGG_ID)) {
        _ should containInAnyOrder (expectedAggs)
      }
      .output(CustomIO[EnrichedIndicatorEntry](EVENT_ID)) {
        _ should containInAnyOrder (expectedEvents)
      }
      .run()
    There isn't a way right now, but it is probably doable by allowing a Beam `TestStream` (which supports event timing) as input. I filed https://github.com/spotify/scio/issues/1891. – Neville Li May 02 '19 at 17:02

1 Answer

https://github.com/spotify/scio/pull/1905

This PR was recently merged and should support this use case. Can you give it a try?
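As a rough sketch of what that might look like with the test from the question: this assumes the PR exposes an `inputStream` method on the JobTest builder that accepts a Beam `TestStream`, and that `testStreamOf` from `com.spotify.scio.testing` is available in your Scio version; please check both against the release you are on. `MyApp`, `CONFIG_ID`, `combinedSampleConfig` and the other names come from the question, and the one-second gap between elements is an arbitrary choice for illustration.

```scala
import com.spotify.scio.testing._
import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage
import org.joda.time.Duration

// Emit each config message as its own TestStream event, advancing
// processing time between events so they are delivered one at a time
// in list order (assumption: the runner handles events sequentially).
val configStream = combinedSampleConfig
  .foldLeft(testStreamOf[PubsubMessage]) { (builder, msg) =>
    builder
      .addElements(msg)
      .advanceProcessingTime(Duration.standardSeconds(1))
  }
  .advanceWatermarkToInfinity()

JobTest[MyApp.type]
  .args(commonArgs: _*)
  // inputStream is assumed to be the entry point added by the PR
  .inputStream(CustomIO[PubsubMessage](CONFIG_ID), configStream)
  .input(CustomIO[IndicatorEntry](INPUT_ID), sampleInput)
  .output(CustomIO[EnrichedIndicatorEntry](AGG_ID)) {
    _ should containInAnyOrder(expectedAggs)
  }
  .output(CustomIO[EnrichedIndicatorEntry](EVENT_ID)) {
    _ should containInAnyOrder(expectedEvents)
  }
  .run()
```

Only the configuration input is switched to a stream here; the other input and the output assertions are left as in the question.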

Neville Li