I have a slightly unusual use case: our applications do not use standard Kafka partitioning. Instead, we have a custom partitioning strategy that uses a specific field within a compound key to decide the partition. This is generally the customerId, so that all records for a single customer end up in the same partition. The key also contains the other IDs that make the message unique, so that compaction still works.

e.g.

topic-1-key

{
  orderId,
  customerId
}

topic-2-key

{
  addressId,
  customerId
}
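To make the partitioning scheme concrete, here is a minimal sketch in plain Java. It is only an illustration: the class and record names, the `String` ID types, and the hash function are assumptions, not the actual partitioner in use. The point is that the partition is derived from `customerId` alone, so an order key and an address key for the same customer map to the same partition number, provided both topics have the same partition count.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of partitioning on one field of a compound key.
final class CustomerPartitioning {

    // Assumed shapes of topic-1-key and topic-2-key (ID types are a guess).
    record OrderKey(String orderId, String customerId) {}
    record AddressKey(String addressId, String customerId) {}

    // Derives the partition from customerId only, ignoring the other IDs.
    // The hash is a simple stand-in, not the murmur2 hash that Kafka's
    // default partitioner applies to the serialized key.
    static int partitionFor(String customerId, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        int hash = Arrays.hashCode(customerId.getBytes(StandardCharsets.UTF_8));
        // floorMod keeps the result in [0, numPartitions) for negative hashes.
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 12; // assumed to be identical for both topics

        OrderKey orderKey = new OrderKey("order-1", "customer-42");
        AddressKey addressKey = new AddressKey("address-9", "customer-42");

        int orderPartition = partitionFor(orderKey.customerId(), numPartitions);
        int addressPartition = partitionFor(addressKey.customerId(), numPartitions);

        // Both keys carry the same customerId, so the partitions match.
        System.out.println("same partition: " + (orderPartition == addressPartition));
    }
}
```

In a real producer this logic would sit behind Kafka's partitioner hooks rather than a static helper; the sketch only shows why the two topics are already co-partitioned by customer even though their full keys differ.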

I want to join these two records together. To do this with the DSL, my only option is to rekey both records to the customerId and then do the join. However, when I do this, Kafka Streams detects that key-changing operations have occurred and automatically creates repartition topics for me. Is there any way to override this behaviour whilst using the DSL?

I'm aware I could do this manually using the Processor API and state stores, but I wondered whether there's a way to do it with the DSL, or whether it's simply not an option.

M21B8

1 Answer

It's not possible right now, i.e., up to Apache Kafka 3.6.

There is already work in progress to add a new operator, markAsPartitioned(), to close this gap. KIP-759 has been accepted and will most likely ship with the 3.7.0 release.

Matthias J. Sax
Thanks for the info! Looking forward to this release then to massively speed up our topologies! – M21B8 Aug 29 '23 at 10:15