I have a Flink application running on 400 TaskManagers. Within a one-hour window, some keys occur much more frequently than others: out of 1,500 unique keys, let's say 50 occur far more often. As a result, a few TaskManagers process much more data than the rest. If 390 TaskManagers are processing 50 MB per minute, the other 10 are processing 10 GB per minute. This makes the system very slow. Can the same key be shared across multiple TaskManagers when the load is high? How can I solve this issue?
1 Answer
If you want finer-grained partitioning of your data, you'll need to find a way to subdivide the current keys. Depending on what you're doing, it might make sense, for example, to add a preprocessing layer that runs before the aggregation at the level of the current keys.
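
One common way to subdivide keys is "salting" followed by a two-stage aggregation. The sketch below is plain Python, not Flink API code, and only illustrates the idea: the names `NUM_SALTS`, `salted_key`, `pre_aggregate`, and `final_aggregate` are hypothetical, and it assumes the aggregation is a sum (or something similarly associative and commutative, such as a count, min, or max).

```python
import random
from collections import defaultdict

NUM_SALTS = 8  # hypothetical fan-out: each key is spread over up to 8 sub-keys


def salted_key(key, num_salts=NUM_SALTS, rng=random):
    """Stage-1 key: the original key plus a random salt in [0, num_salts)."""
    return (key, rng.randrange(num_salts))


def pre_aggregate(records, num_salts=NUM_SALTS, rng=random):
    """Stage 1: sum values per (key, salt).

    In a distributed job this is the step that would be partitioned by the
    salted key, so a hot key's records land on several workers instead of one.
    """
    partials = defaultdict(int)
    for key, value in records:
        partials[salted_key(key, num_salts, rng)] += value
    return dict(partials)


def final_aggregate(partials):
    """Stage 2: drop the salt and combine the partial sums per original key.

    This stage sees at most num_salts inputs per key, so it stays cheap
    even for hot keys.
    """
    totals = defaultdict(int)
    for (key, _salt), partial in partials.items():
        totals[key] += partial
    return dict(totals)


if __name__ == "__main__":
    # One hot key and one cold key, as a stand-in for the skewed workload.
    records = [("hot", 1)] * 1000 + [("cold", 2)] * 10
    partials = pre_aggregate(records)
    print(final_aggregate(partials))
```

In a Flink job, the rough equivalent would be to key the stream by the salted key for a first windowed aggregation, then key the partial results by the original key for a second aggregation. Whether this fits depends on what the job computes; it does not work as-is for logic that needs to see all records of a key together.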

David Anderson