I'm benchmarking the performance of my Flink application, which reads data from Kafka, transforms it, and writes it to another Kafka topic. I need to keep context so that messages with the same order ID are not treated as brand-new orders. To do that, I'm extending the RichFlatMapFunction class and using ValueState. As I understand it, I need a KeyedStream before I can call flatMap:
env.addSource(source()).keyBy(Order::getId).flatMap(new OrderMapper()).addSink(sink());
The problem is that keyBy seems to take a very long time from my perspective (80 to 200 ms). I attribute the latency to keyBy because if I remove keyBy and replace flatMap with a plain map function, the 90th-percentile latency is about 1 ms. Is there a way to use state/context without keyBy, or to make keyBy faster?
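To make the per-order context I'm after concrete, here is a minimal plain-Java sketch of the logic. It deliberately does not use the Flink API: the HashMap is only a stand-in for the keyed ValueState that Flink scopes to the current key after keyBy, and the class name, the Order type, and the "NEW:"/"UPDATE:" output strings are all hypothetical, not my real code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OrderContextSketch {

    /** Hypothetical order event; only the id matters for this sketch. */
    public static final class Order {
        private final String id;

        public Order(String id) {
            this.id = id;
        }

        public String getId() {
            return id;
        }
    }

    // Stand-in for keyed state: one "seen" flag per order id.
    // In Flink this map would not exist; ValueState is scoped per key.
    private final Map<String, Boolean> seenById = new HashMap<>();

    /** Mimics a flatMap call: returns the records emitted for one input. */
    public List<String> flatMap(Order order) {
        List<String> out = new ArrayList<>();
        boolean seen = seenById.getOrDefault(order.getId(), false);
        if (seen) {
            // Same order id arrived before: treat it as an update.
            out.add("UPDATE:" + order.getId());
        } else {
            // First time this order id is seen: remember it.
            seenById.put(order.getId(), true);
            out.add("NEW:" + order.getId());
        }
        return out;
    }

    public static void main(String[] args) {
        OrderContextSketch mapper = new OrderContextSketch();
        System.out.println(mapper.flatMap(new Order("A"))); // prints [NEW:A]
        System.out.println(mapper.flatMap(new Order("A"))); // prints [UPDATE:A]
        System.out.println(mapper.flatMap(new Order("B"))); // prints [NEW:B]
    }
}
```

The point of the sketch is only that the second message for order A must see what the first one left behind, which is why I assumed all messages with the same ID have to be routed to the same operator instance.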