I have a Flink application with a parallelism of 48 (1 JobManager, 3 TaskManagers) and roughly 2300-2400 tasks.
Sometimes Flink can't consume Kafka records fast enough, and this causes latency.
The graphs show no backpressure in any task (I got the results from the Prometheus integration, via flink_taskmanager_job_task_isBackPressured).
I mainly use RocksDB to store state; only 5-6 streams (each with a parallelism of 48) use registerProcessingTimeTimer().
There are no checkpoint or savepoint operations.
What could be the problem? (Or should I add a new node to the cluster?)
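For reference, here is a rough sketch of what the sizing figures above imply. It assumes slots are spread evenly across the TaskManagers and that the 2300-2400 figure counts parallel subtasks; neither is confirmed, and the class and method names are only illustrative.

```java
// Back-of-the-envelope sizing from the numbers in the question.
// Assumptions (not confirmed): slots are evenly distributed across
// TaskManagers, and "tasks" means parallel subtasks.
final class ClusterSizing {

    // Slots each TaskManager must provide to reach the given parallelism.
    static int slotsPerTaskManager(int parallelism, int taskManagers) {
        // Round up so the parallelism still fits if it does not divide evenly.
        return (parallelism + taskManagers - 1) / taskManagers;
    }

    // Average number of subtasks sharing one slot.
    static double subtasksPerSlot(int totalSubtasks, int parallelism) {
        return (double) totalSubtasks / parallelism;
    }

    public static void main(String[] args) {
        System.out.println("slots per TaskManager: " + slotsPerTaskManager(48, 3));
        System.out.println("subtasks per slot (low):  " + subtasksPerSlot(2300, 48));
        System.out.println("subtasks per slot (high): " + subtasksPerSlot(2400, 48));
    }
}
```

Under those assumptions, each TaskManager hosts 16 slots and each slot runs about 48 to 50 subtasks.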
The object in each Kafka record contains 24 primitive fields, 1 complex object (with 15 primitive fields), 1 map<String,String> field, and 1 more complex object (with 8 primitive fields).
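To make the record shape concrete, here is a minimal sketch of it as a plain Java class. Only the field counts come from the description above; every class name, field name, and field type is a hypothetical placeholder, and most fields are elided.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; only the field counts match the real object.
final class KafkaEvent {
    // The real object has 24 primitive fields; three placeholders shown.
    long id;
    int type;
    double value;
    // ... 21 more primitive fields elided

    // Complex object with 15 primitive fields.
    Details details = new Details();

    // The map<String,String> field.
    Map<String, String> attributes = new HashMap<>();

    // Second complex object with 8 primitive fields.
    Extra extra = new Extra();

    static final class Details {
        long createdAt;
        int status;
        // ... 13 more primitive fields elided
    }

    static final class Extra {
        int code;
        boolean flag;
        // ... 6 more primitive fields elided
    }
}
```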