I'm writing a Kafka Streams integration test on top of an embedded Kafka instance (à la spring-kafka-test, see this example). I have two pairs of topics. In each pair the first topic feeds directly into the second, and each pair processes one record. I build a KTable from the second topic of each pair and left-join the two KTables.
The ValueJoiner runs only once, and when it does, the right-hand side is null, even though I have verified that both records reach their respective KTables.
My EmbeddedKafka instance is running with 2 brokers and 3 partitions per topic. Here is the configuration for the streams, producers, and consumers:
properties.put(StreamsConfig.APPLICATION_ID_CONFIG, appName);
properties.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapUrls);
properties.put(StreamsConfig.STATE_DIR_CONFIG, String.format("/tmp/kafka-streams/%s/%s", appName, System.currentTimeMillis()));
properties.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0);
properties.put(StreamsConfig.CLIENT_ID_CONFIG, appName);
properties.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 1000);
properties.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 1);
properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
properties.put(ConsumerConfig.GROUP_ID_CONFIG, appName);
properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, KafkaAvroDeserializer.class);
properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, KafkaAvroDeserializer.class);
properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class);
properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class);
Note that caching is disabled (fourth line: CACHE_MAX_BYTES_BUFFERING_CONFIG is set to 0).
Here is a sanitized version of the code in question:
// Pair A: topicA1 feeds topicA2, and kTableA is built from topicA2
KTable<Long, Value> kTableA =
    kstreamBuilder.table(longSerde, valueSerde, topicA2);
kstreamBuilder.stream(keySerde, envelopeSerde, topicA1)
    .to(longSerde, valueSerde, topicA2);
// Pair B: topicB1 feeds topicB2, and kTableB is built from topicB2
kstreamBuilder.stream(keySerde, envelopeSerde, topicB1)
    .to(longSerde, valueSerde, topicB2.topicName);
KTable<Long, Value> kTableB =
    kstreamBuilder.table(longSerde, valueSerde, topicB2.topicName);
KTable<Long, Result> resultTable = kTableA.leftJoin(kTableB, (a, b) -> {
    // value joiner called only once, b is null
});
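To make the expected behavior concrete, here is a toy model of a table-table left join in plain Java. It is only a sketch and does not use the Kafka Streams API: `LeftJoinModel`, `updateLeft`, and `updateRight` are hypothetical names invented for illustration, and the values are plain strings. It assumes that a left-side update always calls the joiner (with null if the right side has no value for that key yet) and that a right-side update calls the joiner again once a left value exists for the same key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Toy model of a table-table left join (hypothetical, not Kafka Streams code).
class LeftJoinModel {
    private final Map<Long, String> left = new HashMap<>();
    private final Map<Long, String> right = new HashMap<>();
    private final BiFunction<String, String, String> joiner;
    // Every result the joiner has produced, in order.
    final List<String> emitted = new ArrayList<>();

    LeftJoinModel(BiFunction<String, String, String> joiner) {
        this.joiner = joiner;
    }

    // A left-side update always triggers the joiner; the right value may be null.
    void updateLeft(Long key, String value) {
        left.put(key, value);
        emitted.add(joiner.apply(value, right.get(key)));
    }

    // A right-side update triggers the joiner only if a left value exists for the key.
    void updateRight(Long key, String value) {
        right.put(key, value);
        String leftValue = left.get(key);
        if (leftValue != null) {
            emitted.add(joiner.apply(leftValue, value));
        }
    }

    public static void main(String[] args) {
        LeftJoinModel join = new LeftJoinModel((a, b) -> a + "+" + b);
        join.updateLeft(1L, "a");   // right side still empty for key 1
        join.updateRight(1L, "b");  // same key arrives on the right side
        System.out.println(join.emitted); // prints [a+null, a+b]
    }
}
```

Under this model I would expect a second joiner call with a non-null b once the record for the same key reaches the right-hand table. In my test that second call never happens.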
This post is related to my other post, where too many records are being produced. Fiddling with the number of Kafka brokers and partitions per topic led me to this equally undesirable situation.
Thanks in advance for any and all help!