I'm writing a Kafka Streams integration test on top of an embedded Kafka instance (à la spring-kafka-test, see this example). I have two pairs of topics; in each pair the first topic feeds directly into the second, and each pair processes one record. I build a KTable from the second topic of each pair and left-join the two KTables.

The ValueJoiner runs only once, and when it does the right-hand side is null, even though both records reach their respective KTables (verified).

My EmbeddedKafka instance is running with 2 brokers and 3 partitions per topic. Here is the configuration used for the streams, producers, and consumers:

    properties.put(StreamsConfig.APPLICATION_ID_CONFIG, appName);
    properties.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapUrls);
    properties.put(StreamsConfig.STATE_DIR_CONFIG, String.format("/tmp/kafka-streams/%s/%s", appName, System.currentTimeMillis()));
    properties.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0);
    properties.put(StreamsConfig.CLIENT_ID_CONFIG, appName);
    properties.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 1000);
    properties.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 1);
    properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    properties.put(ConsumerConfig.GROUP_ID_CONFIG, appName);
    properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, KafkaAvroDeserializer.class);
    properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, KafkaAvroDeserializer.class);
    properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class);
    properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class);

Note that caching is disabled (4th line: CACHE_MAX_BYTES_BUFFERING_CONFIG is set to 0).

Here is a sanitized version of the code in question:

    KTable<Long, Value> kTableA =
        kstreamBuilder.table(longSerde, valueSerde, topicA2);

    kstreamBuilder.stream(keySerde, envelopeSerde, topicA1)
        .to(longSerde, valueSerde, topicA2);

    kstreamBuilder.stream(keySerde, envelopeSerde, topicB1)
        .to(longSerde, valueSerde, topicB2.topicName);

    KTable<Long, Value> kTableB =
        kstreamBuilder.table(longSerde, valueSerde, topicB2.topicName);

    KTable<Long, Result> resultTable = kTableA.leftJoin(kTableB, (a, b) -> {
        // value joiner called only once, b is null
    });

This post is related to my other post, where too many records are being produced. Fiddling with the number of Kafka brokers and partitions per topic led me to this equally undesirable situation.

Thanks in advance for any and all help!

Freestyle076
  • So, I've been fiddling with a few test cases for KTable-KTable joins, and it turns out that the problem has something to do with a stream feeding directly into a KTable. Both leftJoin and join work on setups with just two KTables and no preceding streams. – Freestyle076 Jul 19 '18 at 20:54
  • Further discoveries: when I reduce the number of partitions per topic from 3 to 1 (number of brokers fixed at 3), I get the correct behavior: left-null, then left-right. Can anyone explain how KTable joins behave when multiple partitions are present? – Freestyle076 Jul 20 '18 at 15:53
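
One thing that may be worth checking, given that a single partition makes the problem go away: KTable-KTable joins are evaluated per partition, so a key has to land in the same partition number in both source topics for the two sides to meet. The following is only a minimal sketch of that idea, not a diagnosis of the topology above. It assumes the partition is picked by hashing the serialized key bytes modulo the partition count; `Arrays.hashCode` is a stand-in for the hash Kafka actually uses (murmur2), and the two encoders are hypothetical stand-ins for two different key serializers.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CoPartitioningSketch {

    // Stand-in partitioner: hash of the serialized key bytes modulo the
    // partition count. ASSUMPTION: Arrays.hashCode replaces Kafka's murmur2
    // purely to keep the sketch self-contained.
    static int partitionFor(byte[] keyBytes, int numPartitions) {
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    // Hypothetical serializer A: 8-byte big-endian long.
    static byte[] encodeAsLong(long key) {
        return ByteBuffer.allocate(Long.BYTES).putLong(key).array();
    }

    // Hypothetical serializer B: the key written as a decimal string.
    static byte[] encodeAsString(long key) {
        return Long.toString(key).getBytes(StandardCharsets.UTF_8);
    }

    // True when both encodings of the same logical key map to the same
    // partition number, i.e. the two join sides would be co-located.
    static boolean coPartitioned(long key, int numPartitions) {
        return partitionFor(encodeAsLong(key), numPartitions)
            == partitionFor(encodeAsString(key), numPartitions);
    }

    public static void main(String[] args) {
        for (int numPartitions : new int[] {1, 3}) {
            int mismatches = 0;
            for (long key = 0; key < 1000; key++) {
                if (!coPartitioned(key, numPartitions)) {
                    mismatches++;
                }
            }
            System.out.println(numPartitions + " partition(s): "
                + mismatches + " of 1000 keys land in different partitions");
        }
    }
}
```

With one partition every key trivially maps to partition 0, which would be consistent with the single-partition run behaving correctly. Whether this is what happens in the test above depends on how the keys written to the two joined topics are actually serialized.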
