I’m getting started on a Beam project that reads from AWS Kinesis, so I have a simple DoFn that accepts a KinesisRecord and logs the contents. I want to write a unit test to run this DoFn and prove that it works. Unit testing with a KinesisRecord has proven to be challenging, though.

I get this error when I try to just use Create.of(testKinesisRecord):

java.lang.IllegalArgumentException: Unable to infer a coder and no Coder was specified. Please set a coder by invoking Create.withCoder() explicitly  or a schema by invoking Create.withSchema().

I have tried providing the KinesisRecordCoder explicitly using "withCoder", as the error suggests, but the class is not public, so I can't reference it from my test. Is there another way to unit test a DoFn that takes a KinesisRecord?

Test code:

public class MyProjectTests {
    @Rule
    public TestPipeline p = TestPipeline.create();

    @Test
    public void testPoC() {
        var testKinesisRecord = new KinesisRecord(
                ByteBuffer.wrap("SomeData".getBytes()),
                "seq01",
                12,
                "pKey",
                Instant.now().minus(Duration.standardHours(4)),
                Instant.now(),
                "MyStream",
                "shard-001"
        );


        PCollection<Void> output =
                p.apply(Create.of(testKinesisRecord))
                        .apply(ParDo.of(new MyProject.PrintRecordFn()));

        var result = p.run();
        result.waitUntilFinish();
        result.metrics().allMetrics().getCounters().forEach(longMetricResult -> {
            Assertions.assertEquals(1, longMetricResult.getCommitted().intValue());
        });
    }
}

DoFn code:

  static class PrintRecordFn extends DoFn<KinesisRecord, Void> {
    private static final Logger LOG = LoggerFactory.getLogger(PrintRecordFn.class);
    private final Counter items = Metrics.counter(PrintRecordFn.class, "itemsProcessed");

    @ProcessElement
    public void processElement(@Element KinesisRecord element) {
      items.inc();

      LOG.info("Stream: `{}` Shard: `{}` Arrived at `{}`\nData: {}",
              element.getStreamName(),
              element.getShardId(),
              element.getApproximateArrivalTimestamp(),
              element.getDataAsBytes());
    }
  }
Alec

1 Answer

KinesisRecordCoder is intended for internal use only, so it is package-private. However, you can provide a custom AWSClientsProvider and use it to generate test data. As an example, please take a look at KinesisMockReadTest and its custom provider.
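
Separately from the mock-provider route, one possible way to cover the logging logic without any coder is to move it into a plain method that takes ordinary values. The sketch below is only an illustration under that assumption: RecordFormatter and describe are hypothetical names, not part of Beam or of the original code, and the payload is assumed to be UTF-8 text.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a Beam class): builds the text that PrintRecordFn logs.
final class RecordFormatter {
  private RecordFormatter() {}

  // Takes plain values instead of a KinesisRecord, so a test needs no pipeline and no coder.
  static String describe(String streamName, String shardId, String arrivedAt, byte[] data) {
    // Assumption: the record payload is UTF-8 encoded text.
    String payload = new String(data, StandardCharsets.UTF_8);
    return "Stream: `" + streamName + "` Shard: `" + shardId
        + "` Arrived at `" + arrivedAt + "`\nData: " + payload;
  }
}
```

Inside processElement, the DoFn could then call RecordFormatter.describe(element.getStreamName(), element.getShardId(), element.getApproximateArrivalTimestamp().toString(), element.getDataAsBytes()) and log the result. This only tests the formatting; running the DoFn in a TestPipeline with real KinesisRecord elements still needs the mock-provider approach described above.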

Alexey Romanenko